
Crashing Firefox with Regular Expression

I recently came across an interesting crash in Firefox and decided to investigate it further. A quick Google search showed that the issue is already known and was reported to Mozilla a few months ago.
However, the bug has not been fixed yet (at least as of FF 26), so as a personal exercise I decided to dig a little deeper and collect some notes, which I am sharing in this blog post.
Here is a brief analysis of what I found, thanks also to the pointers from my friend Andrzej Dereszowski.

This is the crash PoC:

<html>
<head>

<script>
function main() {
    // 2147483647 is the largest signed 32-bit integer, used here as the minimum repeat count.
    var regexp = /(?!Z)r{2147483647,}M\d/;
    "A".match(regexp);
}

main();
</script>
</head>
<body>
</body>
</html>


Below is a WinDbg screenshot showing the crash on Firefox 25 / Windows 8.1 (64-bit):

 

At this stage, we can infer that an overflow occurred and that, as a protective measure, FF deliberately crashed instead of handling the issue gracefully. The PoC already hints at the cause: the integer 2147483647 (the largest signed 32-bit value) is used as a quantifier in the regular expression.

In the call stack, there are functions dealing with the RegExp just before mozjs!WTF::CrashOnOverflow::overflowed. Let's put a breakpoint on the previous function, mozjs!JSC::Yarr::YarrGenerator<1>::generatePatternCharacterFixed+0x87, and see what happens just before the overflow is detected.

This is the function on which we are setting the breakpoint (bp):

void generatePatternCharacterFixed(size_t opIndex)
    {
        YarrOp& op = m_ops[opIndex];
        PatternTerm* term = op.m_term;
        UChar ch = term->patternCharacter;

        const RegisterID character = regT0;
        const RegisterID countRegister = regT1;

        move(index, countRegister);
        sub32(Imm32(term->quantityCount.unsafeGet()), countRegister);

        Label loop(this);
        BaseIndex address(input, countRegister, m_charScale, (Checked<int>(term->inputPosition - m_checked + Checked<int64_t>(term->quantityCount)) * static_cast<int>(m_charSize == Char8 ? sizeof(char) : sizeof(UChar))).unsafeGet());

The bp is set on the BaseIndex address(...) statement, which is where the checks on our integer are performed.

After stepping through several checks, our integer (2147483647) ends up in both lhs and rhs, which are then added together. The sum is stored in the "result" variable, as shown below:



The sum of lhs and rhs is 4294967294 (0xFFFFFFFE), which is stored in an int64. A further check is then performed, as shown below:

    template <typename U> Checked(const Checked<U, OverflowHandler>& rhs)
        : OverflowHandler(rhs)
    {
        if (!isInBounds<T>(rhs.m_value))
            this->overflowed();
        m_value = static_cast<T>(rhs.m_value);
    }
 
Within the isInBounds check (in the screenshot below), the minimum value is 0x80000000 and the maximum value is 0x7FFFFFFF, i.e. the range -2147483648 to 2147483647 of a signed 32-bit integer (an int, or a long on Windows).
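
To make the range test concrete, here is a minimal sketch of what such a bounds check boils down to. The name is_in_bounds_sketch and its shape are my own illustration and do not mirror WTF's real isInBounds implementation; it only covers signed integer conversions where the source type is at least as wide as the target.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Simplified, hypothetical stand-in for the bounds test (not the WTF code):
// is `value` representable in the narrower signed type Target?
template <typename Target, typename Source>
constexpr bool is_in_bounds_sketch(Source value) {
    static_assert(std::is_integral<Target>::value && std::is_signed<Target>::value,
                  "sketch covers signed integer targets only");
    static_assert(std::is_integral<Source>::value && std::is_signed<Source>::value,
                  "sketch covers signed integer sources only");
    static_assert(sizeof(Source) >= sizeof(Target),
                  "sketch assumes a narrowing or same-width conversion");
    // Widen the target limits to the source type, then compare.
    return value >= static_cast<Source>(std::numeric_limits<Target>::min()) &&
           value <= static_cast<Source>(std::numeric_limits<Target>::max());
}
```

With a 32-bit target, 2147483647 passes this test, while any larger 64-bit value, such as the one computed in this crash, does not.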



The rhs.m_value is now 4294967294 (0xFFFFFFFE) as a result of the previous addition of lhs and rhs.
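
The arithmetic behind that value can be reproduced in isolation. The sketch below uses two hypothetical helpers of my own (add_widened and low_32_bits, neither of which exists in the Firefox source) and assumes, as observed in the debugger, that both operands hold 2147483647 and that the addition is carried out in 64 bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: add two 32-bit values in a 64-bit accumulator,
// so the sum itself cannot overflow.
inline std::int64_t add_widened(std::int32_t lhs, std::int32_t rhs) {
    return static_cast<std::int64_t>(lhs) + static_cast<std::int64_t>(rhs);
}

// Hypothetical helper: the low 32 bits of a 64-bit value, i.e. what would
// be left if the value were truncated instead of range-checked.
inline std::uint32_t low_32_bits(std::int64_t value) {
    return static_cast<std::uint32_t>(value);  // conversion is modulo 2^32
}
```

Calling add_widened(2147483647, 2147483647) gives 4294967294, which fits comfortably in an int64. Its low 32 bits are 0xFFFFFFFE, a pattern that reads as -2 when interpreted as a signed 32-bit value, which is presumably why the conversion is range-checked rather than silently truncated.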



This trips the check, as 0xFFFFFFFE is greater than 0x7FFFFFFF (the maximum value in the isInBounds check). overflowed() is then called, which simply crashes FF.
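
The whole path, from the narrowing constructor to the overflow handler, can be sketched with a small policy-based class. Everything below is illustrative: CheckedSketch, ThrowOnOverflowSketch, RecordOverflowSketch and narrowing_overflows are names I made up, and the throwing handler stands in for the real crash so that the behaviour can be observed without taking the process down.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <stdexcept>
#include <type_traits>

// Stand-in for a crash-on-overflow policy. The handler seen in the call
// stack terminates the process; this sketch throws instead.
struct ThrowOnOverflowSketch {
    void overflowed() const { throw std::overflow_error("checked conversion out of range"); }
};

// Purely illustrative alternative policy that only remembers the failure.
struct RecordOverflowSketch {
    bool had_overflow = false;
    void overflowed() { had_overflow = true; }
};

template <typename T, typename Handler = ThrowOnOverflowSketch>
class CheckedSketch : public Handler {
    static_assert(std::is_integral<T>::value && std::is_signed<T>::value,
                  "sketch covers signed integer types only");
public:
    // Narrowing constructor: validate the 64-bit input before storing it.
    explicit CheckedSketch(std::int64_t value) {
        if (value < std::numeric_limits<T>::min() || value > std::numeric_limits<T>::max()) {
            this->overflowed();  // the policy decides what happens next
            return;              // reached only by non-throwing policies
        }
        m_value = static_cast<T>(value);
    }
    T get() const { return m_value; }
private:
    T m_value = 0;
};

// Returns true when narrowing to int32 trips the throwing handler.
inline bool narrowing_overflows(std::int64_t value) {
    try {
        CheckedSketch<std::int32_t> checked(value);
        (void)checked;
        return false;
    } catch (const std::overflow_error&) {
        return true;
    }
}
```

In this sketch narrowing_overflows(4294967294LL) returns true, matching the moment where FF calls overflowed(), while narrowing_overflows(2147483647LL) returns false because the value still fits in 32 bits.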
