Splunk Field Extractions for Symantec Messaging Gateway (a.k.a. Brightmail) Syslogs

The Symantec Messaging Gateway, formerly known as Brightmail, is a spam-filtering appliance; you can read more about it from Symantec here. The appliance appears to run on Linux and has both a web interface and a command-line interface accessible via SSH. It can also send system- and application-level logs via syslog.

The system-level logs include processes such as sshd, crond and sudo; the application-level mail logs consist of two processes: ecelerity and bmserver. In this post I focus on the application-level logs, those beginning with the <142> prefix. Symantec has some not-so-helpful documentation on this appliance's log formats here: https://support.symantec.com/en_US/article.HOWTO15282.html

From what I see in Splunk, the logs are in the format: <identifier>date time server-name process[process-number]: process-id|message-id|event|variable-log-format. There appear to be 18 different application-level log events, each with a different format. Those events are: IRCPTACTION, ACCEPT, VERDICT, TRACKERID, UNTESTED, FIRED, SENDER, LOGICAL_IP, EHLO, MSG_SIZE, MSGID, SOURCE, SUBJECT, ORCPTS, DELIVER, ATTACH, UNSCANNABLE and VIRUS. These varying formats make it impossible to use Splunk's built-in field extraction interface. An alternative solution is to write a custom regex extraction.

Lacking complete documentation, I had to reverse-engineer a regex extraction from the logs being sent to my Splunk server. With this in mind, be warned that my final regex extraction may contain errors. Also be aware that the field names in the extraction are ones I assigned; they are not official field names, as I could not find complete documentation on this system's log format.

I used Regex101.com to help me craft and test my regular expression extraction. You can view my saved regex and test string with anonymized syslog entries at: https://regex101.com/r/kR0iS8/1. If you visit this link, and are familiar with regular expressions, you will notice that I used multiple positive look-behinds. This is the best solution I could come up with to deal with the variable log formats produced by the SMG. My skill with regular expressions is intermediate at best, so there may very well be better solutions out there. If someone with more knowledge of regular expressions reads this article and cares to correct me, feel free to leave a comment.

Below is the final regular expression that I came up with. This seems to work for the majority of the logs that Splunk processes from Symantec Messaging Gateway, but may need further tweaking.

^<142>(?P<date>\w+\s+\d+)\s+(?P<time>[^ ]+)\s+(?P<server>\w+)\s+(?P<process_name>[a-z]+)\[(?P<process_number>\d+)[^ \n]* (?P<process_id>[^\|]+)\|(?P<message_id>[^\|]+)\|(?P<action>IRCPTACTION|VERDICT|UNTESTED|FIRED|SENDER|LOGICAL_IP|EHLO|MSG_SIZE|MSGID|SOURCE|SUBJECT|ORCPTS|TRACKERID|ATTACH|UNSCANNABLE|VIRUS|DELIVER|ACCEPT)(?:(?:(?<=ACCEPT|DELIVER|LOGICAL_IP)\|(?P<src>[^:\s]+)(?::(?P<port>[0-9]+))?(?:\|(?P<to>[^\s]+))?)|(?:(?<=FIRED|IRCPTACTION|ORCPTS|TRACKERID|UNTESTED|VERDICT)\|(?P<recipient>[^\s\|]+)(?:\|)?(?P<result>[a-z][^\|\s]+)?(?:\|(?P<result_2>[a-z][^\|]+))?(?:\|(?P<result_3>.+))?)|(?:(?<=SENDER)\|(?P<from>[^\s]+))|(?:(?<=MSG_SIZE)\|(?P<msg_size>\w+))|(?:(?<=SUBJECT)\|(?P<subject>.*))|(?:(?<=ATTACH)\|(?P<attachment>.+))|(?:(?<=UNSCANNABLE)\|(?P<reason>.+))|(?:(?<=VIRUS)\|(?P<virus_name>.+))|(?:(?<=EHLO)\|(?P<fqdn>.+)))?
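As a sanity check outside Splunk, the look-behind technique can be exercised in Python against a synthetic, anonymized log line (hostname and IDs below are invented for illustration). One caveat: Splunk's PCRE engine accepts variable-width look-behinds like (?<=ACCEPT|DELIVER|LOGICAL_IP), but Python's re module requires fixed-width ones, so this simplified sketch handles only the SENDER branch:

```python
import re

# Synthetic, anonymized SMG application log line (values invented)
sample = "<142>Mar 10 12:34:56 smg01 bmserver[1234]: 0a1b2c3d|9f8e7d6c|SENDER|user@example.com"

# Simplified version of the extraction: common prefix fields, then a
# fixed-width look-behind that only fires for the SENDER event
pattern = re.compile(
    r"^<142>(?P<date>\w+\s+\d+)\s+(?P<time>[^ ]+)\s+(?P<server>\w+)\s+"
    r"(?P<process_name>[a-z]+)\[(?P<process_number>\d+)\]:\s+"
    r"(?P<process_id>[^|]+)\|(?P<message_id>[^|]+)\|(?P<action>\w+)"
    r"(?:(?<=SENDER)\|(?P<from>\S+))?"
)

m = pattern.match(sample)
print(m.group("action"), m.group("from"))  # SENDER user@example.com
```

For non-SENDER events the look-behind fails, the optional group matches nothing, and only the common prefix fields are captured, which is exactly how the full PCRE extraction branches per event type.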

If readers have any questions or comments about this extraction, feel free to leave a comment and I will try to respond in a timely manner.

Preventing Clickjacking Using Content Security Policy

I recently came across an issue with a legacy system that allowed its login page to be framed by any site. Complicating matters, that login page legitimately needs to be framed by pages residing on other subdomains of the same site.

Framing login pages is a bad practice in general. In most cases I would opt to rewrite a site's code and remove the need to frame login pages. However, in this specific case rewriting the code would have taken an extended amount of time. Furthermore, the whole system is scheduled to be replaced within the next year. With these factors in mind, I went searching for a quick way to block unauthorized framing of the system's login pages.

With the need to allow framing from multiple subdomains, the old best practice of setting the X-Frame-Options header to SAMEORIGIN won't work. I did a little research and found the ALLOW-FROM option on Mozilla's Developer Network. Some browsers, such as Firefox and Internet Explorer, support ALLOW-FROM, which allows content to be framed from a single specified URI (wildcards are not supported). However, the MDN documentation states that ALLOW-FROM is not supported in Chrome or Safari.

After doing a bit of searching, I found comments on multiple bug trackers (1, 2, 3) suggesting that Chrome and other WebKit-based browsers will never support this feature. Instead they support the Content Security Policy frame-ancestors directive.

If you are unfamiliar with Content Security Policy, see this link. The tl;dr is that Content Security Policy allows sites to restrict where content such as JavaScript, images, and fonts is loaded from and how it can be used within the page. In this post we will look specifically at the frame-ancestors directive.

Implementing the frame-ancestors option is very simple and consists of adding a single header. The system I mentioned at the beginning of this post runs on Apache. So implementing this was as simple as adding the line below to the .htaccess file in the login page’s directory.

Header set Content-Security-Policy "frame-ancestors https://*.[appdomain].com;"

That single line allows only subdomains of the app's domain to frame the login page. Framing from any other origin will produce a Content Security Policy violation message or a blank page, depending on the browser. This assumes the page is viewed with a browser that supports the frame-ancestors directive.
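For older browsers that lack CSP support, the two mechanisms discussed above can be combined. A hypothetical .htaccess sketch (domain names invented for illustration) that sends the frame-ancestors directive plus an X-Frame-Options fallback for Firefox and Internet Explorer:

```apacheconf
# Hypothetical domains for illustration.
# CSP-aware browsers enforce frame-ancestors; ALLOW-FROM is a
# fallback for Firefox/IE and accepts only a single URI.
Header set Content-Security-Policy "frame-ancestors https://*.example.com;"
Header set X-Frame-Options "ALLOW-FROM https://portal.example.com"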

Before implementing this on the legacy system mentioned above, I created a few test pages to see how this worked. I created a test login page at http://alec.dhuse.com/cs/csp/login.html. If you visit this page you will notice a message regarding an externally loaded jQuery library. I included this because some Content Security Policies will block external JavaScript libraries, and I wanted to make sure any changes I made still allowed these libraries to load.

I also created a .htaccess file with a single entry to block framing on any site other than alec.dhuse.com. That line is:

Header set Content-Security-Policy "frame-ancestors http://alec.dhuse.com https://alec.dhuse.com;"

Next I created a page that frames that login page and includes an example of a div overlay that could be used to grab a user's credentials. I then hosted that file in two places: one on the same subdomain here: http://alec.dhuse.com/cs/csp/framed.html and one on the parent domain here: http://dhuse.com/cs/csp/framed.html.

If you visit the two links, you will notice that the first link, hosted on the same subdomain, allows framing. You can also click the “Show login overlay” checkbox to see new text fields and a login button appear over the framed page. These overlaid controls allow user-entered credentials to be intercepted and then passed on to the original system. For the second link, the framed login page will not load and may even show an error message.

Last thoughts:

If you examine the code on the login page you will notice a meta tag attempting to implement the same policy supplied by the Apache headers. This was an attempt to implement the same policy without the need to send an extra header. However, while many Content Security Policy directives can be set via this meta tag, frame-ancestors cannot be: the CSP specification explicitly requires that frame-ancestors be ignored when delivered via a meta element.

Splunk HTTP Event Collector Python 3 Example

With Splunk’s latest release, version 6.3, a new feature called the HTTP Event Collector has been added. It allows sending JSON-formatted data to Splunk via an HTTP call. I won’t go into all the details of this feature in this post, but for the curious more information can be found here.

This feature is great for anyone who wants to easily get data into Splunk from their own scripts. As this is a new feature, there are not yet many examples of how to use it on the scripting side. In this post I want to provide an example in Python that others can build upon in their own code.

Below is a short and documented example using the urllib library to craft an HTTP request that Splunk’s HTTP Event Collector will accept.

import json
import time
import urllib.request

def send_event(splunk_host, auth_token, log_data):
   """Sends an event to the HTTP Event Collector of a Splunk instance"""

   post_success = False

   try:
      # Integer value representing epoch time format
      event_time = int(time.time())

      # String representing the host name or IP
      host_id = "localhost"

      # String representing the Splunk sourcetype, see:
      # docs.splunk.com/Documentation/Splunk/6.3.2/Data/Listofpretrainedsourcetypes
      source_type = "access_combined"

      # Create request URL
      request_url = "http://%s:8088/services/collector" % splunk_host

      post_data = {
         "time": event_time,
         "host": host_id,
         "sourcetype": source_type,
         "event": log_data
      }

      # Encode data in JSON UTF-8 format
      data = json.dumps(post_data).encode('utf8')

      # Create auth header
      auth_header = "Splunk %s" % auth_token
      headers = {'Authorization': auth_header}

      # Create and send request
      req = urllib.request.Request(request_url, data, headers)
      response = urllib.request.urlopen(req)

      # Read response, should be in JSON format
      read_response = response.read().decode('utf8')

      try:
         response_json = json.loads(read_response)
         post_success = response_json.get("text") == "Success"
      except ValueError:
         post_success = False

      if post_success:
         # Event was received successfully
         print("Event was received successfully")
      else:
         # Event returned an error
         print("Error sending request.")

   except Exception as err:
      # Network or connection error
      post_success = False
      print("Error sending request")
      print(str(err))

   return post_success

def main():
   splunk_auth_token = "00000000-0000-0000-0000-000000000000"
   splunk_host = "10.11.12.13"

   log_data = {
      "data_point_1": 50,
      "data_point_2": 20,
   }

   result = send_event(splunk_host, splunk_auth_token, log_data)
   print(result)

if __name__ == "__main__":
   main()

A few things to note: this example does not use SSL, so the Enable SSL check box in the HTTP Event Collector global settings must be unchecked. Also, Splunk is picky about the top-level JSON keys; only a few specific keys can be used: time, host, source, sourcetype, index and event. All custom data should be under the event key. Finally, this code should work in all versions of Python 3.

Detecting Abuse on WordPress Sites

This blog runs on WordPress. It is pretty obvious from my basic layout and theme, if not the “Proudly powered by WordPress” stamp at the bottom of this page. WordPress is great: it is easy to set up, easy to manage, easy to write posts for, and easy to extend with plugins. This ease of use attracts a lot of users who tend not to know much about how websites work or about web security in general.

Due to this lack of knowledge by some of its users, many WordPress instances are vulnerable to attack. These attacks may target poorly written plugins, weak passwords, or outdated versions of WordPress core or its plugins. Most vulnerabilities in these targets are widely known by attackers, who write scripts to look for them. Attackers can build lists of WordPress sites using carefully crafted search engine queries. Site lists combined with automated scripts looking for known vulnerabilities mean that attackers can probe a large number of sites in a relatively short amount of time.

As a result, most WordPress blogs are tested by attackers for vulnerabilities at some point. These tests are usually not very discreet and can be detected in the server logs. I am a curious individual, so I look at my logs quite a bit and I see a number of these vulnerability tests. When I first noticed these log entries, I ignored them. As I kept seeing them, I noticed that most of these attack tests conformed to patterns. This made me wonder if I could detect them automatically. It turns out that for many of them I can.

I accomplished this automation primarily by creating custom 404 and 400 error documents and directing my .htaccess file to use them. In these files I capture the remote IP, request URL and the user agent. In the 404 error handler I examine the request URL for strings that I’ve observed over and over again in my Apache logs: URL requests for files that do not exist in my WordPress instance. These requests are usually for specific files used by plugins that I do not have installed. When I find these requests I log the remote IP, time and a short description.
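Wiring custom error documents up in Apache takes only a couple of directives. A hypothetical .htaccess sketch (handler paths invented for illustration; my actual layout may differ):

```apacheconf
# Hypothetical paths: route 404 and 400 errors to custom handlers
# that log the remote IP, request URL and user agent
ErrorDocument 404 /error-handlers/404.php
ErrorDocument 400 /error-handlers/400.php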

I also examine the user agent strings, looking for strings generated by common code libraries like python, perl, wget, curl and others. These come from scripts written to look for specific files, many of which are testing for vulnerabilities. If I see one of these user agent strings requesting a non-existent file, I log it. I also added this code to the head of my wp-login.php file, denying access to script user agents and logging any attempts to log in. This catches a fair number of scripts trying to brute-force my WordPress login.
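The user agent check itself is simple substring matching. My site code is PHP, but the logic can be sketched in a few lines of Python (the marker list here is illustrative, not my exact one):

```python
# Illustrative marker list of scripting libraries/tools seen in logs
SCRIPT_AGENT_MARKERS = ["python", "perl", "wget", "curl", "libwww"]

def is_script_agent(user_agent):
    """Return True if the user agent string matches a common scripting library."""
    ua = user_agent.lower()
    return any(marker in ua for marker in SCRIPT_AGENT_MARKERS)

print(is_script_agent("python-requests/2.4.3"))  # True
print(is_script_agent("Mozilla/5.0 (Windows NT 6.1; rv:40.0) Gecko/20100101 Firefox/40.0"))  # False
```

Substring matching is crude (user agents are trivially spoofable), but since these scripts rarely bother to disguise themselves, it catches a surprising amount.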

Finally, in the 400 error document, I again look for specific strings that I’ve found in my logs, logging any matches. All this code results in a list of IPs with descriptions and timestamps. As of writing this post I have not done much with the resulting list, but I figured it might be of use to other site admins out there. So for anyone interested, my list of detected malicious IPs can be viewed at: http://scarletshark.com/intel-lists/v1/mal-ips.php

The link above provides a CSV file of malicious IPs detected in the last 24 hours. It is also possible to specify the time frame and format. Right now the script supports both csv and json formats and can provide results for a time frame of the last 1 to 336 hours (2 weeks).

Example JSON format for malicious IPs in the last 36 hours:

http://scarletshark.com/intel-lists/v1/mal-ips.php?format=json&hours=36

Example CSV format for malicious IPs in the last 12 hours:

http://scarletshark.com/intel-lists/v1/mal-ips.php?format=csv&hours=12
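For scripted consumers, building a request URL with the parameter bounds described above might look like this Python sketch (the function name is invented for illustration):

```python
BASE_URL = "http://scarletshark.com/intel-lists/v1/mal-ips.php"

def mal_ip_url(fmt="csv", hours=24):
    """Build a mal-ips.php request URL, validating format and time frame."""
    if fmt not in ("csv", "json"):
        raise ValueError("format must be csv or json")
    if not 1 <= hours <= 336:
        raise ValueError("hours must be between 1 and 336")
    return "%s?format=%s&hours=%d" % (BASE_URL, fmt, hours)

print(mal_ip_url("json", 36))
# http://scarletshark.com/intel-lists/v1/mal-ips.php?format=json&hours=36
```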

All timestamps are in USA Central time. I plan on updating the malicious IP detection and this output script in the future, and will version any breaking changes via the v* directory. Feel free to use this list in aggregated block lists for private or commercial use.


Photo Upload and Preview with Paper.js

With the rise of social media and collaborative sites, users are sharing more of their own content than ever. While users upload many different types of content to these sites, for this post I am focusing on photos.

Chances are, if you code for the web, you have had or will have to write code that lets users upload their photos. In the old days you could get away with a simple file selection and upload. Nowadays, however, users expect to preview their images and to be able to apply a few simple edits before uploading. In this post I cover photo rotation and cropping, but depending on the context users may expect more, such as photo filters or user tagging.

Before starting this project, I Googled a bit to see what was already out there. While I found quite a few examples and libraries to show an image preview, none of those allowed any editing of the photo before upload. I feel that giving the user some basic editing ability is important, and what’s more I think users expect to have that ability on modern websites. So with that idea in mind I began creating a photo upload and preview widget with these basic tools.

I didn’t want to rewrite a graphics library as plenty of those already exist, but I would have to choose one for this project. I’ve used Paper.js for other projects in the past and my familiarity with it is the sole reason I chose it for this project. I’m sure any other similar library would work just as well, especially since this project does not call for any complex graphical work. That being said, Paper.js also has other raster features that would make implementing future improvements like photo filters an easy task.

For this project I started by building the HTML and CSS to structure the layout. The photo tool icons are SVG images that can be styled with CSS. For photo selection I used a standard file input. The change event on this input triggers a custom function that adds the image to the Paper.js canvas and activates the edit controls. For the photo itself, I didn’t want to affect its resolution by resizing the image, so when the image is added I use the view zoom to fit the photo in the viewport. This preserves the original photo size and resolution.

Rotation was easy, as there are built-in functions to do this. The hard part was cropping the image after rotation. Paper.js does not return image dimensions correctly when the image is rotated; the position and dimensions reported are those of the non-rotated image. To compensate, I had to write code to rotate the given position and dimensions myself.
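The compensation boils down to standard 2D rotation of a point about the image center. My implementation is in JavaScript, but the math can be sketched in Python:

```python
import math

def rotate_point(x, y, cx, cy, degrees):
    """Rotate point (x, y) around center (cx, cy) by the given angle in degrees."""
    theta = math.radians(degrees)
    dx, dy = x - cx, y - cy
    rx = cx + dx * math.cos(theta) - dy * math.sin(theta)
    ry = cy + dx * math.sin(theta) + dy * math.cos(theta)
    return rx, ry

# Rotating the corner (10, 0) by 90 degrees around the origin lands near (0, 10)
print(rotate_point(10, 0, 0, 0, 90))
```

Applying this to each corner of the crop rectangle, using the image center as the pivot, gives the true corner positions of the rotated image.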

After solving the rotation issue, I had to extract the image as a data URL and craft an AJAX request. That finished the client-side part of the code. As a proof of concept I created a quick PHP receiving file that takes the given data URL and creates an image file on the server. It’s just a skeleton, but it could easily be adapted to most projects.

I’ve created a repository on GitHub for this project for interested readers. Also check out the client side code on this CodePen:

See the Pen Upload Image and Preview by Alec Dhuse (@alecdhuse) on CodePen.