0001001: Need a way to set up redundant OCSP responders - CAcert Bug Tracker

ID	Project	Category	View Status	Date Submitted	Last Update

0001001	Main CAcert Website	misc	public	2011-12-14 20:57	2012-12-30 08:22

Reporter	NEOatNHNG	Assigned To
Priority	normal	Severity	major	Reproducibility	always
Status	new	Resolution	open
Platform	Main CAcert Website	OS	N/A	OS Version	stable

Summary	0001001: Need a way to set up redundant OCSP responders
Description	Some organisations have offered to host an OCSP server on their infrastructure to overcome some stability problems. The problem is that we can't give them an OCSP cert (they could then produce valid responses for any serial number, even if it is not valid or is revoked in our system). One possibility would be to build some kind of caching mechanism that doesn't need an OCSP cert because it just gets the responses from the main server. Unfortunately there seems to be no existing software that does this, so we probably have to implement it ourselves.
Additional Information	Possible architecture:
- Master OCSP server runs a daemon which sends signed OCSP responses for all serial numbers valid in the system to known slave servers
- If asked for known serial numbers slaves answer with the cached response
- If asked for unknown serial number slaves ask master
  * If serial not valid, master answers with a signed response and slaves cache that response for a limited time (maybe even more sophisticated: if revoked cache for long time, if only unknown only cache for short time) and send the response to the client
  * If serial valid, slave caches that response as usual and sends response to client
  * If master unreachable or indicates failure indicate failure to the client

There is also an IETF draft but it never made it towards standardisation before it expired https://tools.ietf.org/html/draft-ietf-pkix-ocsp-caching-00
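The architecture described above could be sketched roughly as follows. This is only an illustration of the lookup logic on a slave, not an implementation proposal: the names `SlaveCache`, `CachedResponse`, `MasterUnreachable` and the `ask_master` callback are hypothetical, the TTL values are placeholders (the description only says "long" and "short"), and the signed response is treated as an opaque blob that the slave never has to sign itself.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple

# Placeholder lifetimes in seconds (assumptions, not values from this ticket):
# revoked answers are cached for a long time, "unknown" only briefly.
TTL = {"good": 3600, "revoked": 86400, "unknown": 300}


class MasterUnreachable(Exception):
    """Raised by the ask_master callback when the master cannot answer."""


@dataclass
class CachedResponse:
    status: str         # "good", "revoked" or "unknown"
    signed_blob: bytes  # response already signed by the master
    expires_at: float


class SlaveCache:
    def __init__(self,
                 ask_master: Callable[[int], Tuple[str, bytes]],
                 clock: Callable[[], float] = time.time) -> None:
        self._ask_master = ask_master
        self._clock = clock
        self._cache: Dict[int, CachedResponse] = {}

    def push(self, serial: int, status: str, signed_blob: bytes) -> None:
        """Called when the master daemon pushes a pre-signed response."""
        self._cache[serial] = CachedResponse(
            status, signed_blob, self._clock() + TTL[status])

    def lookup(self, serial: int) -> Optional[bytes]:
        """Return a signed response, or None to signal failure to the client."""
        now = self._clock()
        entry = self._cache.get(serial)
        if entry is not None and entry.expires_at > now:
            return entry.signed_blob            # known serial: answer from cache
        try:
            status, blob = self._ask_master(serial)  # unknown or stale: ask master
        except MasterUnreachable:
            return None                         # master down: indicate failure
        self._cache[serial] = CachedResponse(status, blob, now + TTL[status])
        return blob
```

A real slave would presumably derive the cache lifetime from the validity fields inside the signed response rather than from a fixed table, and would turn the `None` result into one of the unsigned OCSP error responses (for example tryLater), which do not require a responder certificate.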
Tags	No tags attached.

Reviewed by
Test Instructions

Sourcerer 2011-12-17 17:43 administrator ~0002746	The problem that I see with this idea is the traffic it generates. At the moment, CAcert has 700 thousand issued certificates, so we can round that up to 1 million certs. An OCSP response takes approximately 2 KB, so we have 2 GB of traffic that we would want to distribute regularly (daily? hourly? in case of emergency?) to each OCSP cache. If we do it daily, then we have 60 GB of traffic per month per OCSP responder. I think having "trusted" OCSP servers, where CAcert is sure that they are operated properly, is a better approach than running untrusted OCSP caches. My suggestion is to set up several OCSP servers in various trusted places, monitor them both automatically and manually to check that they are behaving properly, and (semi-)automatically pull their plug through DNS if they misbehave.

NEOatNHNG 2011-12-17 18:59 administrator ~0002747	We only need to distribute OCSP responses for _valid_ certs not for all issued certs, which is 82,807 certs at the moment (one order of magnitude less) and likely to stay more or less the same (grows with the number of users not with time). Also there are some ways to make distribution more efficient (e.g. compress multiple responses together in one package). AFAIK the responder we have in mind right now even resides in the same data center as our servers. I don't know about our contracts but this might be even cheaper, at least faster. Having trusted sites will need serious consideration. OCSP certs can not be revoked in practice and DNS changes might take a while to propagate. Apart from that DNS was not built for high security applications as some vulnerabilities have shown in the past. DNSSEC might improve that but it's not widely implemented yet and the issue of stub resolvers is not handled.
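To make the traffic figures in the two notes above easier to compare, here is a small back-of-the-envelope calculation. It only uses numbers quoted in this thread (about 2 KB per response, 1 million certs as the rounded-up total, 82,807 currently valid certs); the 30-day month, the decimal definition of a GB and the function names are assumptions made for illustration, and compression is ignored.

```python
RESPONSE_SIZE_BYTES = 2_000  # "approximately 2 KB" per OCSP response


def full_push_bytes(cert_count: int,
                    response_size: int = RESPONSE_SIZE_BYTES) -> int:
    """Bytes needed to send one response per certificate to one cache."""
    return cert_count * response_size


def monthly_bytes(cert_count: int, pushes_per_day: int, days: int = 30) -> int:
    """Bytes per month and per cache for a given number of full pushes per day."""
    return full_push_bytes(cert_count) * pushes_per_day * days


def to_gb(n_bytes: int) -> float:
    """Convert bytes to decimal gigabytes."""
    return n_bytes / 1e9


if __name__ == "__main__":
    scenarios = (("all issued (rounded up)", 1_000_000),
                 ("currently valid", 82_807))
    for label, count in scenarios:
        print(f"{label}: {to_gb(full_push_bytes(count)):.3f} GB per push, "
              f"{to_gb(monthly_bytes(count, 1)):.2f} GB/month daily, "
              f"{to_gb(monthly_bytes(count, 24)):.2f} GB/month hourly")
```

With these assumptions a daily push of all issued certs reproduces the 2 GB and 60 GB figures, while restricting the push to valid certs brings one full push down to roughly 166 MB.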

NEOatNHNG 2011-12-17 19:00 administrator ~0002748	P.S.: ideally we wouldn't have to do caching on our side at all and every server would support OCSP stapling, but I guess that's not gonna happen anytime soon.

Sourcerer 2011-12-18 01:33 administrator ~0002749	Having several OCSP responders in the same datacenter is not helping much. We need OCSP responders in different places, at least different countries, preferably different continents. If one datacenter or even a whole country has an outage, CAcert would jeopardize all CAcert certificate users on the rest of the planet who need OCSP to be available, by not having enough OCSP servers in other areas. CAcert should provide a highly available OCSP infrastructure on a global scale.

NEOatNHNG 2011-12-18 02:27 administrator ~0002750	The main problems today are due to crashes of the responder software or machine. In these concrete cases a second responder, even in the same data center, would improve the situation a lot. For disaster recovery and latency it might indeed be a good idea to spread them around the world, but this is more a "what happens if" matter, not "what is most annoying right now". For a global spread we could try to scale up the design proposed in the description; then we would likely need some additional efficiency improvements (e.g. only distribute responses for certs where there was at least one request in some fixed time span, to only catch certs in active use). Or we could indeed operate multiple full-blown OCSP responders, which would probably be quite costly because the environment needs to be secured. The question is how these full-blown responders get to know which certs are valid. If this requires an active connection to the database, it would be as bad as a local solution. If they take a CRL, there are problems if someone just issues certs above our current serial number range, because they can't know what the highest valid number is, and so on (this is what the current responder does IIRC). One possible issue with the caching approach that I have not looked into yet: the OCSP request allows for an extension which includes a nonce with the request, and the response then also needs to contain this nonce. I don't know how many clients actually do this, as then all sorts of caching (including OCSP stapling) would not work any more. If it's a high percentage, we might need to go for the full-blown OCSP responder approach anyway.
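The nonce concern above can be illustrated with a small routing sketch: a cache can only reuse a pre-signed response when the request carries no nonce, and counting both cases would give the percentage asked about. `ParsedRequest`, `route` and `NonceStats` are hypothetical names, and the request is modelled as an already-decoded structure rather than parsed from DER; the OID is the standard id-pkix-ocsp-nonce identifier. Forwarding every nonce request to the master is a conservative assumption here, since some clients may accept a response that omits the nonce.

```python
from dataclasses import dataclass, field
from typing import Dict

OCSP_NONCE_OID = "1.3.6.1.5.5.7.48.1.2"  # id-pkix-ocsp-nonce


@dataclass
class ParsedRequest:
    """Hypothetical stand-in for a decoded OCSP request."""
    serial: int
    extensions: Dict[str, bytes] = field(default_factory=dict)  # OID -> value


def route(request: ParsedRequest) -> str:
    """Decide whether a cached, pre-signed response may be used."""
    if OCSP_NONCE_OID in request.extensions:
        # A pre-signed response cannot contain this client's fresh nonce.
        return "forward-to-master"
    return "serve-from-cache"


@dataclass
class NonceStats:
    """Counts requests to estimate how many clients send a nonce."""
    total: int = 0
    with_nonce: int = 0

    def record(self, request: ParsedRequest) -> None:
        self.total += 1
        if OCSP_NONCE_OID in request.extensions:
            self.with_nonce += 1

    def nonce_ratio(self) -> float:
        """Fraction of recorded requests that carried a nonce (0.0 if none)."""
        return self.with_nonce / self.total if self.total else 0.0
```

Running something like `NonceStats` against the existing responder's traffic for a while would show whether the share of nonce requests is small enough for the caching approach to be worthwhile.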

NEOatNHNG 2012-02-08 23:02 administrator ~0002828	Maybe we could also use standard P2P techniques to distribute the traffic between the caching OCSP responders

CookieEater 2012-12-30 08:22 reporter ~0003581	Maybe http://www.ejbca.org/installation-ocsp.html shows a possible solution. Every OCSP responder should have its own certificate. If something goes wrong at one responder, its certificate will be revoked. Of course if this happens, it is a very uncomfortable situation, so protection of the OCSP responder's private key is really an issue. Each responder has its own database, which is of course synced with the master server, so the responder works even if the master is temporarily down.

Date Modified	Username	Field	Change
2011-12-14 20:57	NEOatNHNG	New Issue
2011-12-17 17:43	Sourcerer	Note Added: 0002746
2011-12-17 18:59	NEOatNHNG	Note Added: 0002747
2011-12-17 19:00	NEOatNHNG	Note Added: 0002748
2011-12-18 01:33	Sourcerer	Note Added: 0002749
2011-12-18 02:27	NEOatNHNG	Note Added: 0002750
2012-02-08 23:02	NEOatNHNG	Note Added: 0002828
2012-12-02 16:38	INOPIAE	Relationship added	related to 0001119
2012-12-30 08:22	CookieEater	Note Added: 0003581