Valgrind and OpenSSL

I’ve just filed Debian bug report #534534 about Valgrind/Helgrind reporting “Possible data race during write” [1]. I included a patch that seems to fix that problem (by checking whether a variable is non-zero before setting it to zero). But further testing with Valgrind 3.4.1 (backported from Debian/Unstable) suggests that my patch is not worth using, and in any case I expect that Valgrind-related patches won’t be accepted into the Lenny version of OpenSSL.

I would appreciate suggestions on how to fix this. The problem is basically a single static variable that is initialised to 1 but set to 0 the first time one of the malloc functions is called. Using a lock for this is not desirable as it would add overhead to every malloc operation. However, without the lock there does seem to be a possible race condition if one thread calls CRYPTO_set_mem_functions() and, before that operation is finished, a time slice is given to a thread that is allocating memory. So in spite of the overhead I guess that using a lock is the right thing to do.
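To make the pattern concrete, here is a rough sketch of the problem (the names are mine for illustration, not the actual OpenSSL source):

#include <stdlib.h>

/* Illustrative sketch only, not the OpenSSL code.  The flag starts at 1
 * and is cleared on the first allocation; both the read and the write
 * happen without any synchronisation, which is what Helgrind reports. */
static int allow_customise = 1;

static void *wrapped_malloc(size_t n)
{
        if (allow_customise)            /* unsynchronised read */
                allow_customise = 0;    /* unsynchronised write */
        return malloc(n);
}

static int set_mem_functions(void)
{
        if (!allow_customise)   /* another thread may clear the flag   */
                return 0;       /* between this check and the point    */
                                /* where the replacements are installed */
        /* install the replacement allocation functions here */
        return 1;
}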

deb http://www.coker.com.au lenny gcc

For the convenience of anyone who is testing these things on Debian and wants to use the latest Valgrind, the above Debian repository has Valgrind 3.4.1 and a build of GCC that fixes the problem I mentioned in my previous blog post about Valgrind [2].

if (default_RSA_meth == NULL)
        default_RSA_meth=RSA_PKCS1_SSLeay();

I have also filed bug #534656 about another reported race condition in the OpenSSL libraries [3]. Above is the code in question (with some C preprocessor stuff removed). This seems likely to be a problem only on an architecture where assignment of a pointer is not an atomic operation, and I don’t know whether we even have any architectures that work in such a way.
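If a fix is wanted, one obvious approach is to do the lazy initialisation under one of OpenSSL’s existing locks. The following is only a sketch (assuming the check lives in RSA_get_default_method() as it does in 0.9.8), not a tested patch:

/* Sketch of one possible fix: take the RSA lock around the check and the
 * assignment so that two threads cannot race on default_RSA_meth.  This
 * adds a lock operation to every call, which may not be acceptable. */
const RSA_METHOD *RSA_get_default_method(void)
{
        const RSA_METHOD *ret;

        CRYPTO_w_lock(CRYPTO_LOCK_RSA);
        if (default_RSA_meth == NULL)
                default_RSA_meth = RSA_PKCS1_SSLeay();
        ret = default_RSA_meth;
        CRYPTO_w_unlock(CRYPTO_LOCK_RSA);
        return ret;
}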

static void impl_check(void) {
        CRYPTO_w_lock(CRYPTO_LOCK_EX_DATA);
        if(!impl)
                impl = &impl_default;
        CRYPTO_w_unlock(CRYPTO_LOCK_EX_DATA);
}
#define IMPL_CHECK if(!impl) impl_check();

My bug report #534683 [4] is about a similar issue with the above code: the IMPL_CHECK macro reads impl without holding the lock. If the macro is changed to just call impl_check() unconditionally then the problem will go away, but at some performance cost.
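The trivial (if slower) version of that change would be something like this sketch:

/* Sketch of the simple fix: always call impl_check(), which re-tests impl
 * under CRYPTO_LOCK_EX_DATA, instead of doing an unlocked read of impl in
 * the macro.  Every use then pays for a lock and unlock. */
#define IMPL_CHECK impl_check();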

I filed bug report #534685 about a similar issue with the EX_DATA_CHECK macro [5].

I filed bug report #534687 about some code that has CRYPTO_w_lock(CRYPTO_LOCK_EX_DATA); before it [6], so the code may actually be safe and the report may come down to how Valgrind recognises problems (either a Valgrind bug or a limitation in how Valgrind interprets what the OpenSSL code is doing). Valgrind 3.3.1 reported many more issues similar to this one, so it appears that version 3.4.1 improved this analysis but didn’t go quite far enough.

I filed bug report #534706 about the cleanse_ctr global variable that is used as a source of pseudo-randomness for the OPENSSL_cleanse() function without locking [7]. It seems that they have the idea that memset() is not adequate for clearing memory. Does anyone know of a good research paper about recovering the contents of memory after memset()? I doubt that we need such things.
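To illustrate the class of problem, here is a simplified paraphrase of the pattern that Helgrind flags (not the actual OPENSSL_cleanse() code): every caller does an unsynchronised read-modify-write of the global counter.

#include <stddef.h>

/* Simplified paraphrase, not the real OPENSSL_cleanse(): the global
 * counter is read at the start and written back at the end without any
 * locking, so concurrent callers race on it.  As the counter only
 * perturbs the fill pattern the race is arguably harmless, but Helgrind
 * has no way of knowing that. */
static unsigned char cleanse_ctr = 0;

static void cleanse(void *ptr, size_t len)
{
        unsigned char *p = ptr;
        unsigned char ctr = cleanse_ctr;        /* unlocked read */

        while (len--)
                *p++ = ctr++;                   /* overwrite the buffer */
        cleanse_ctr = ctr;                      /* unlocked write */
}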

I filed bug report #534699 about what appears to be a potential race condition in int_new_ex_data() [8]. The def_get_class() function obtains a lock before returning a pointer to a member of a hash table. It seems possible for an item to be deleted from the hash table (and its memory freed) after def_get_class() has returned the pointer but before int_new_ex_data() accesses the memory in question.

I filed bug report #534889 about int_free_ex_data() and int_new_ex_data(), which call def_get_class() before obtaining a lock and then use the data returned from that function inside the locked region [9] (it seems that obtaining the lock earlier would solve this).
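Here is a generic illustration of this class of bug, with made-up names (struct entry, table_lock, lookup(), lookup_locked()) rather than the actual ex_data.c code: a pointer handed out from under a lock is only safe to use while the lock protecting the table is still held.

#include <pthread.h>
#include <stddef.h>

/* Generic illustration only; the names do not match the OpenSSL source.
 * lookup() takes and releases the table lock internally, so the entry it
 * returns may be freed by another thread before the caller uses it. */
struct entry { int refcount; };

static pthread_mutex_t table_lock = PTHREAD_MUTEX_INITIALIZER;

struct entry *lookup(int key);          /* locks, searches, unlocks */
struct entry *lookup_locked(int key);   /* caller must hold table_lock */

static void racy_caller(int key)
{
        struct entry *e = lookup(key);  /* lock already released here */
        if (e != NULL)
                e->refcount++;          /* e may have been freed by now */
}

static void safer_caller(int key)
{
        struct entry *e;

        pthread_mutex_lock(&table_lock);        /* hold the lock across */
        e = lookup_locked(key);                 /* both the lookup and  */
        if (e != NULL)                          /* the use of the entry */
                e->refcount++;
        pthread_mutex_unlock(&table_lock);
}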

I filed bug report #534892 about another piece of code which would have a race condition if pointer assignment isn’t atomic, this time in err_fns_check() [10]. In my first pass I didn’t bother filing bug reports about most of the issues helgrind raised with the error handling code (there were so many that I just hoped that there was some subtle locking involved that eluded helgrind and my brief scan of the source). But a new entry in my core file collection suggests that this may be a problem area for my code.

I think that it is fairly important to get security related libraries to be clean for use with valgrind and other debugging tools – if only to allow better debugging of the code that calls them. I would appreciate any assistance that people can offer in terms of fixing these problems. I know that there are security risks in terms of changing code in such important libraries, but there are also risks in leaving potential race conditions in such code.

As an aside, I’ve filed wishlist bug report #534695 requesting that valgrind have a feature to automatically add entries to the suppressions file [11]. As a function that is considered unsafe can be called from different contexts, and code that is considered unsafe can be in a macro that is used from multiple functions, many different suppressions can be needed. Pasting them all into the suppressions file is tedious.
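For anyone who hasn’t dealt with suppressions before, an entry looks something like the following hypothetical example for the impl_check() report (the fun: frames have to match what Helgrind actually prints, so these names are only illustrative). Valgrind’s --gen-suppressions=all option will print ready-made blocks in this format, but they still have to be pasted into the file by hand.

{
   openssl_impl_check_race
   Helgrind:Race
   fun:int_new_ex_data
   fun:CRYPTO_new_ex_data
}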

8 comments to Valgrind and OpenSSL

  • The last time someone used Valgrind against OpenSSL a huge bug hit the Debian community.
    I hope this will not be the case :)

  • You could use atomic operations instead of locks, though not all processors provide atomic operations.

  • nine

    hmm, debian, openssl, valgrind…. Be careful what you patch!

  • imajica

    Howdy Russ, I need you to consider the following carefully.
    Can you see a scenario where default_RSA_meth should be null?
    From OpenSSL (0.9.8k):

    #ifdef RSA_NULL
    default_RSA_meth=RSA_null_method();
    #else
    #if 0 /* was: #ifdef RSAref */
    default_RSA_meth=RSA_PKCS1_RSAref();
    #else
    default_RSA_meth=RSA_PKCS1_SSLeay();
    #endif
    #endif

    My point is that there are other exchange modes we want OpenSSL to support … history shows us that locking out options is what gets projects into trouble.
    I also feel the need to say this:
    Crypto hackers MUST NOT *MUST NOT* allow the “just make it work” mentality to survive. Further, we must allow and encourage support for the adoption of new PKI within existing systems … I know everyone loves RSA, and I know this point of view makes me unpopular, so let me just say that public RSA itself is technology from 1978 (created off work that was at best initiated in 1973); the updates to the algorithm have largely been based on raising “boundary limits” … the warning is that D/H work shows us that with this kind of hashing, the larger the boundary of the data set, the higher the probability of collision. These are very real problems now, and while yes, there is a lot of work being done to create better exchange systems, we desperately need as many options on the table as we can comfortably and rationally support. In this case I think our Debian comrades have a larger decision to address: whether to move away from a “just make it practical” approach (which works very well in other software) to a “trust the design goals of the upstream providers” approach in regards to complex crypto wrappers.

    Let’s discuss it over a jar or two soon.

    ALC

  • etbe

    Marco and nine: I was thinking of exactly that when I wrote “I know that there are security risks in terms of changing code”. I deliberately made no direct mention of the past problem in question because I want to keep discussion focussed on how to go forwards, not on mistakes that were made in the past.

    glandium: What do we have to do on i386 and AMD64 to enable such atomic operations? I doubt that anyone will be doing new deployments of intensive multi-threaded SSL programs on any other architecture nowadays.

    imajica: The fact that there is a check for NULL suggests that it is possible.

    I agree that we need to allow support for other algorithms, but I’m sure you agree that we can support them without making our code non-thread-safe! I resisted the obvious temptation to suggest that defaults be hard-coded – about half the bugs I reported could have been resolved by writing static configuration with no option to change certain parameters.

    If you want to meet up some time then suggest a time and a place. You know roughly where I live so I’m sure you can find a bar or restaurant that is convenient for both of us.

  • The easiest way to get atomic operations is to borrow working code from an existing project, isn’t it?

    E.g. http://www.boost.org/doc/libs/1_39_0/boost/interprocess/detail/atomic.hpp
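    For i386 and AMD64 with GCC 4.1 or later it can also be done without an external library, using the compiler’s __sync builtins. A minimal sketch (with an illustrative variable name, not a patch):

    /* Minimal sketch, assuming GCC 4.1+ __sync builtins on i386/AMD64;
     * the variable name is illustrative.  The compare-and-swap atomically
     * turns the flag from 1 into 0, so only one thread sees the
     * transition and no lock is needed on the common path. */
    static volatile int allow_customise = 1;

    static int clear_flag_once(void)
    {
            /* non-zero only for the thread that made the 1 -> 0 change */
            return __sync_bool_compare_and_swap(&allow_customise, 1, 0);
    }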

  • gebi

    > What do we have to do on i386 and AMD64 to enable such atomic operations? I doubt that anyone will be doing new deployments of intensive multi-threaded SSL programs on any other architecture nowadays.

    hm?

    The other way around. There are way better platforms for crypto out there,
    e.g. SPARC T2, Tile64 [0], MIPS Octeon, …
    Not to forget the big PPC and Itanium systems.

    Only recently did the x86 platform gain initial support for usable crypto (in terms of performance), with Intel’s Nehalem platform which has hardware AES support [1].

    [0]: http://www.tilera.com/pdf/ProductBrief_TILEPro64_Web_v2.pdf
    [1]: http://softwarecommunity.intel.com/isn/downloads/intelavx/AES-Instructions-Set_WP.pdf

  • etbe

    gebi: How do they compare to VIA PadLock and other cheap hardware that is designed for such things?

    Also I expect that if someone wanted to port SSL code to a GPU then performance would be pretty good. The FPGA devices in high-end SGI machines should also do well. Of course that would take a moderate amount of work, and with hardware being cheap nowadays it’s often more cost-effective to just buy bigger machines.

    But if you want to rent some servers in a hosting center, or run some virtual machines on EC2, Linode, Slicehost, etc. then you get none of that. It’s the AMD64 ISA and nothing else.