  1. Product Roadmaps
  2. MMF-1800

SonarSecurity detects Python XSS vulnerabilities involving DTL and Jinja2 Template Engines


    Details

    • Type: MMF
    • Status: Closed
    • Priority: Major
    • Resolution: Fixed
    • Labels:

      Description

      WHY

      Cross-site Scripting (XSS) has been identified as the most common vulnerability fixed by open-source Python developers.
      To be competitive in the Python SAST market, we need to detect both trivial and more complex XSS vulnerabilities.

      WHAT

      Django and Flask, the two main web frameworks used by Python developers, each rely on a templating engine to manage the "V" part of the MVC pattern: the Django Template Language (DTL) for Django and Jinja2 for Flask.

      Both frameworks rely on .html files augmented with a special syntax that looks like pseudo-code. DTL and Jinja2 share the same syntax, with some subtle differences in behavior. These subtleties will be ignored in the first implementation, whose purpose is detecting XSS vulnerabilities. The main syntax elements are:

      • Variables: "{{ variable }}"
      • Filters: "{{ name|lower }}"
      • Tags: "{% tag %}"

      Although these files are .html files, they should be considered part of the execution flow: only at the point where the tainted data is output can we tell whether there is a vulnerability.

      The goal of this MMF is to cover the use case of Python web apps built entirely with Django or Flask and not relying on a JS framework for the UI. Web apps combining JS and Python will be covered in a second step.

      HOW

      Sources for Django and Flask

      Sources are defined in the scope of MMF-1879 and MMF-1815.

      Autoescaping configuration in Django

      Escaping sanitizes the input; autoescaping does it implicitly whenever a value is used in a template. A Django project enables autoescaping by default. It can still be controlled through the project settings, in the file settings.py (relevant part of such a file below, doc):

      TEMPLATES = [
          {
              'BACKEND': 'django.template.backends.django.DjangoTemplates',
              'DIRS': [],
              'APP_DIRS': True,
              'OPTIONS': {
                  'autoescape': False, # deactivates autoescaping; can be set to True or be absent to activate it
                  'context_processors': [
                  ],
              },
          },
      ]
      

      We should read this value to know whether the project autoescapes by default.
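
      As an illustration of reading this value statically, here is a minimal sketch that parses the settings source with Python's ast module. The function name read_django_autoescape is hypothetical, and the sketch assumes TEMPLATES is written as a plain literal; if it contains computed values (for example a call in 'DIRS'), it simply falls back to Django's default of True.

```python
import ast


def read_django_autoescape(settings_source: str) -> bool:
    """Return the 'autoescape' option of the first DjangoTemplates backend.

    Hypothetical helper: falls back to True (Django's default) when the
    option is absent or TEMPLATES is not a plain literal.
    """
    tree = ast.parse(settings_source)
    for node in tree.body:
        if not isinstance(node, ast.Assign):
            continue
        names = [t.id for t in node.targets if isinstance(t, ast.Name)]
        if "TEMPLATES" not in names:
            continue
        try:
            engines = ast.literal_eval(node.value)
        except (ValueError, TypeError):
            return True  # non-literal settings: assume the default
        if not isinstance(engines, (list, tuple)):
            return True
        for engine in engines:
            if not isinstance(engine, dict):
                continue
            if str(engine.get("BACKEND", "")).endswith("DjangoTemplates"):
                options = engine.get("OPTIONS", {})
                return bool(options.get("autoescape", True))
    return True
```

      Treating an unreadable configuration as "autoescape on" is only one possible choice; a more conservative analysis could treat it as unknown and report accordingly.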

      Then, in each template file, there can be regions where autoescaping is explicitly activated or deactivated with the autoescape tag:

      <!DOCTYPE html>
      <html>
      <body>
      {% autoescape off %}
      <h1>Hello {{ name }}</h1> 
      {% endautoescape %}
      </body>
      </html>
      

      That means that for each variable in a template (name in this example) we should know whether it is inside such a tag.

      Finally, there are filters on variables. We are interested in:

      • name | safe - disables autoescaping for this variable (doc); safeseq is out of scope of this MMF
      • name | escape - sanitizes the input (doc)

      To sum up, for each variable in the project templates we should know whether it is escaped. If it is not, it is a potential sink.
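
      The decision described above can be sketched as a small function combining the three inputs: the project default, the enclosing autoescape tags, and the filters on the variable. The name is_escaped and its parameters are hypothetical. The sketch also makes a simplifying assumption that is not stated in this ticket: any use of safe is treated as "not escaped" regardless of filter order, which is conservative and may over-report.

```python
from typing import Sequence


def is_escaped(filters: Sequence[str],
               enclosing_autoescape: Sequence[bool],
               project_default: bool = True) -> bool:
    """Decide whether a template variable is escaped when rendered.

    filters: filter names applied to the variable, e.g. ["lower", "safe"].
    enclosing_autoescape: on/off state of the autoescape tags around the
        variable, outermost first (empty if there are none).
    project_default: the 'autoescape' option read from settings.py.
    """
    if "safe" in filters:
        return False  # assumption: 'safe' always wins (conservative)
    if "escape" in filters:
        return True   # explicit escaping, even inside 'autoescape off'
    if enclosing_autoescape:
        return enclosing_autoescape[-1]  # innermost tag decides
    return project_default


# A variable for which is_escaped(...) is False is a potential sink.
```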

      Sinks in Django

      Strictly speaking, a Django project is vulnerable when a tainted value is part of an HttpResponse returned from a view function:

      from django.http import HttpResponse

      def insecure_hello(request):
          name = request.GET.get("name", "default name")  # tainted user input
          return HttpResponse("Hello %s." % name)  # tainted value reaches the response
      

      But to simplify, we consider the HttpResponse constructor itself as a sink (so there is no need to check that the response is actually returned).

      More often, an HTML template is used to generate the content of the HttpResponse:

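      from django.http import HttpResponse
      from django.template import loader

      def insecure_hello_template(request):  # hypothetical view name
          name = request.GET.get("name", "default name")
          template = loader.get_template('xss/insecure_hello_template.html')
          context = { 'name': name }
          response = HttpResponse(template.render(context, request))
          return response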
      

      or with the shortcut function render:

      from django.shortcuts import render

      def insecure_hello_render(request):  # hypothetical view name
          name = request.GET.get("name", "default name")
          context = { 'name': name }
          return render(request, 'xss/insecure_hello_template.html', context)
      

      When a template is used, even though the actual problem happens when the HttpResponse is returned, we decided that it is a better user experience to place the primary location of the issue in the HTML template file.

      So we will generate a UCFG for each template file, with one instruction per injectable (not escaped) variable. This instruction should be a call to a methodId declared as a sink, e.g. "Django_template_variable", with the variable passed as argument. Note that the location of this instruction must be valid, so that the issue is reported correctly in the SQ UI. The parameters of this UCFG should be all the variables used in the template.

      Django Templates: Connecting the Dots

      We have a source inside the UCFG for the Python function and a sink inside the UCFG for the HTML template file; now we should connect them. For that, the UCFG for the Python function should call the template's UCFG wherever the corresponding template is used. This has to be done in the Python frontend:

      • find the call to a function or method "render"
      • find the path to the template file used ("xss/insecure_hello_template.html" in our example); we don't need to cover all cases, a simple inline string literal should be enough
      • collect the list of keys in the "context" variable
      • replace the call to "render" with a call to the UCFG corresponding to the template file used. Two approaches are then possible:
        • labeled arguments, where each argument is an expression accessing the "context" variable (a map) by key:
          return insecure_hello_template_UCFG(name=context.__mapGet("name"), ...)

        • a single context map variable as the argument, in which case the UCFG for the template has a single parameter and each variable usage is replaced by context.__mapGet("name"):
          return insecure_hello_template_UCFG(context)

      Sinks in Flask

      First, we should consider every return in a function decorated with @app.route to be a sink. For that, in the UCFG we should wrap every returned value in a call to a methodId declared as a sink. In this case no template is used, so we want to report the issue at the return statement.
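
      A minimal sketch of how such returns could be located, using Python's ast module. The helper names are hypothetical, and the sketch assumes that any decorator of the form <something>.route(...) marks a Flask view; returns inside nested helper functions are also collected, which a real implementation would exclude.

```python
import ast
from typing import List


def _is_route_decorator(node: ast.expr) -> bool:
    # Matches @<anything>.route(...), e.g. @app.route('/path')
    return (isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "route")


def find_route_returns(source: str) -> List[int]:
    """Return the line numbers of value-returning statements in route handlers."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.FunctionDef):
            continue
        if not any(_is_route_decorator(d) for d in node.decorator_list):
            continue
        for inner in ast.walk(node):
            if isinstance(inner, ast.Return) and inner.value is not None:
                lines.append(inner.lineno)
    return sorted(lines)
```
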
      Also, in Flask it is common, even when an HTML template file is used, to bypass the templating engine and replace values manually:

      from flask import make_response
      with open('templates/xss_shared.html') as f:
          html = f.read()
      response = make_response(html.replace('{{ name }}', foo))
      

      In this case no autoescaping can be applied, as no templating engine is used. So we want to consider make_response as a sink, and for that the tainted value (foo in our example) must propagate through the call to replace. As replace is a sanitizer, we can't simply declare it as a passthrough. Instead, the frontend should find such replace calls (one way is to check that the first argument is a literal containing curly braces) and emit a different methodId in the UCFG, which we will declare as a passthrough.
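
      The heuristic mentioned above (first argument is a literal containing curly braces) could look like the following sketch. The name find_template_replaces and the PLACEHOLDER pattern are assumptions made for illustration; the pattern only recognizes "{{ ... }}" placeholders.

```python
import ast
import re
from typing import List

# Assumption: a manual template substitution targets a "{{ ... }}" placeholder
PLACEHOLDER = re.compile(r"\{\{.*?\}\}")


def find_template_replaces(source: str) -> List[int]:
    """Return line numbers of x.replace('{{ ... }}', value) calls."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if (isinstance(node, ast.Call)
                and isinstance(node.func, ast.Attribute)
                and node.func.attr == "replace"
                and len(node.args) >= 2
                and isinstance(node.args[0], ast.Constant)
                and isinstance(node.args[0].value, str)
                and PLACEHOLDER.search(node.args[0].value)):
            lines.append(node.lineno)
    return lines
```

      Calls found this way would get the dedicated passthrough methodId, while every other replace keeps its sanitizer behavior.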

      json.dumps should be declared as a passthrough to detect cases like the one below. We will get an issue because every returned value is wrapped in a call to a sink methodId.

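      import json
      from flask import Flask, request

      app = Flask(__name__)

      @app.route('/insecure/api/json_no_sanitization', methods=['GET'])
      def json_no_sanitization():
          param = request.args.get('param', 'not set')  # tainted query parameter
          bean = dict(name=param)
          return json.dumps(bean, indent=4)  # returned without sanitization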
      

      jsonify and Response with the argument mimetype='application/json' should be considered sanitizers.

      Finally, when a templating engine is used, we should follow the same approach as for Django. There is only one function to cover for Flask: render_template

      from flask import Flask, render_template, request

      app = Flask(__name__)

      @app.route('/insecure/user/update_details_html', methods=['POST'])
      def update_details_html():
          username = request.form['username']  # tainted form field
          return render_template('template.html', name=username)
      

      Inside the template we should cover the same concepts: regions delimited by tags that activate or deactivate autoescaping, and filters.

      Parsing of HTML

      We will need to parse the HTML files to retrieve the templating information: variables, tags and filters. Even though this syntax is not advanced, relying directly on regular expressions might not be reliable. We might consider writing our own small grammar to support these constructs.
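
      To give an idea of the scale of such a scanner, here is a minimal hand-written sketch that extracts variables (with their filter names) and tags, together with the location needed to report an issue. It is not a full grammar: Token and tokenize are hypothetical names, and comments ({# ... #}), verbatim-style blocks and string literals containing "|" or delimiters are deliberately not handled.

```python
from typing import Iterator, List, NamedTuple


class Token(NamedTuple):
    kind: str        # "variable" or "tag"
    name: str        # variable expression or tag name
    args: List[str]  # filter names (variable) or remaining words (tag)
    line: int        # 1-based line of the opening delimiter
    column: int      # 0-based column of the opening delimiter


def tokenize(template: str) -> Iterator[Token]:
    pos = 0
    while True:
        var_at = template.find("{{", pos)
        tag_at = template.find("{%", pos)
        starts = [i for i in (var_at, tag_at) if i != -1]
        if not starts:
            return
        start = min(starts)
        is_var = start == var_at
        end = template.find("}}" if is_var else "%}", start + 2)
        if end == -1:
            return  # unterminated construct: stop scanning
        body = template[start + 2:end]
        line = template.count("\n", 0, start) + 1
        column = start - (template.rfind("\n", 0, start) + 1)
        if is_var:
            parts = [p.strip() for p in body.split("|")]
            # Keep only the filter name, dropping any ":argument" suffix
            filters = [p.split(":", 1)[0].strip() for p in parts[1:]]
            yield Token("variable", parts[0], filters, line, column)
        else:
            words = body.split()
            if words:
                yield Token("tag", words[0], words[1:], line, column)
        pos = end + 2
```

      A consumer can walk the tokens in order, pushing and popping autoescape regions on the autoescape and endautoescape tags, and flag each variable token that ends up unescaped.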


              People

              Assignee:
              alexandre.gigleux Alexandre Gigleux
              Reporter:
              alexandre.gigleux Alexandre Gigleux