Email Parsing: Creating Structure from Unstructured Data

Email in its current form is fundamentally chaotic, and its data is difficult (though not impossible) to extract from the confines of the inbox, which is what makes email parsing so useful. Parsing lets users choose what they are searching for, such as keywords or attachments, and then pulls the matching data fields from the sender’s details, email footers, and the email body itself. Users can go as granular as needed or cast a wider net: collect “august 2017 sales figures” or simply “sales figures.”

Email Parsing to Extract Data 

Email parsing can be done through API requests that pull specific pieces of data from incoming emails, much like a search engine scraping the web for specific information. Quality email parsing solutions extract email content and route it to a designated location according to rules you define.

You can parse emails for data like:

  • Email subject lines
  • Email metadata
  • Email send dates/times
  • Email open dates/times
  • Email reply dates/times
  • Email attachments
  • Email signature
  • Email links/hyperlinks

This gives developers a way to turn unstructured emails into categorized, structured data. It serves as a bridge between potentially valuable email content and a database.
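To make the idea of turning a message into structured fields concrete, here is a minimal sketch that uses only Python's standard `email` module rather than the Nylas API. The sample message, the `parse_email` helper, and the URL pattern are assumptions made for illustration. Fields such as open and reply times are not part of a raw message, so they are left out.

```python
import re
from email import message_from_string
from email.policy import default
from email.utils import parsedate_to_datetime

# Hypothetical raw message, invented for this example.
RAW_EMAIL = """\
From: Jane Doe <jane@example.com>
To: sales@example.com
Subject: August 2017 sales figures
Date: Fri, 01 Sep 2017 09:30:00 +0000
MIME-Version: 1.0
Content-Type: text/plain; charset=utf-8

Hi team,
The figures are at https://example.com/reports/aug-2017 for review.
"""

# Deliberately simple pattern; real-world link extraction needs more care.
URL_PATTERN = re.compile(r"https?://[^\s<>\"']+")


def parse_email(raw):
    """Turn one raw RFC 5322 message into a flat dictionary of fields."""
    msg = message_from_string(raw, policy=default)
    body_part = msg.get_body(preferencelist=("plain",))
    body = body_part.get_content() if body_part is not None else ""
    return {
        "subject": str(msg["Subject"]),
        "sender": str(msg["From"]),
        "sent_at": parsedate_to_datetime(str(msg["Date"])),
        # iter_attachments() yields nothing for a message without attachments.
        "attachments": [part.get_filename() for part in msg.iter_attachments()],
        "links": URL_PATTERN.findall(body),
    }


record = parse_email(RAW_EMAIL)
print(record["subject"], record["links"])
```

A dictionary like `record` is the kind of structured row that could then be written to a database.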

Email Parsing: Code Examples

Parsing allows users to narrow massive volumes of email content down to the level of detail that is useful to them. For example, a user could search for messages with attachments that contain important financial records or other accounting data. With the Nylas email API, emails with attachments can be identified and sent to a processing location.

Let’s take a look at how to do this.

First, make a GET request to the /threads endpoint to return a list of the most recent threads in the user’s email inbox. For each thread, check the has_attachments attribute to see if any messages in the thread have an attachment.

$ curl -X GET 'https://api.nylas.com/threads' \
-H 'Authorization: Bearer ACCESS_TOKEN'
[
    {
        "account_id": "1234***",
        "has_attachments": true,
        "message_ids": [
            "5634***",
            "3456***"
        ],
        "subject": "Our Financial Future",
        "id": "4312****",
         ...
    },
    ...
]
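The has_attachments check can be sketched in Python as follows. This is an illustrative helper that works on an already-decoded response, not part of any Nylas SDK. The function name and the second sample thread are hypothetical.

```python
import json

# Sample shaped like the /threads response above. The second thread and its
# IDs are invented for illustration.
SAMPLE_THREADS = json.loads("""
[
  {"id": "4312****", "has_attachments": true,
   "message_ids": ["5634***", "3456***"], "subject": "Our Financial Future"},
  {"id": "thread-2", "has_attachments": false,
   "message_ids": ["msg-9"], "subject": "Lunch on Friday?"}
]
""")


def message_ids_with_attachments(threads):
    """Return message IDs from every thread flagged as having attachments."""
    message_ids = []
    for thread in threads:
        # A missing has_attachments key is treated as "no attachments".
        if thread.get("has_attachments"):
            message_ids.extend(thread.get("message_ids", []))
    return message_ids


print(message_ids_with_attachments(SAMPLE_THREADS))
```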

If the has_attachments attribute is true, make a GET request to the /messages/{id} endpoint for each message in the thread to look for associated files:

$ curl -X GET 'https://api.nylas.com/messages/5634***' \
-H 'Authorization: Bearer ACCESS_TOKEN'
{
    "account_id": "1234***",
    "files": [
        {
            "content_disposition": "inline",
            "content_id": "NsJAw***",
            "content_type": "application/pdf",
            "filename": "my_file.pdf",
            "id": "5fuf8***",
            "size": 2007754
        }
    ],
    "id": "7a8939****",
    "subject": "Our Financial Future",
    "thread_id": "4312****",
    ...
}
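Once the message is decoded, picking out the files of interest is a small filtering step. The sketch below assumes the response shape shown above. The `files_of_type` helper and the second file entry are hypothetical.

```python
# Sample shaped like the /messages/{id} response above. The PNG entry is
# invented to show that non-matching files are skipped.
SAMPLE_MESSAGE = {
    "id": "7a8939****",
    "subject": "Our Financial Future",
    "files": [
        {"id": "5fuf8***", "filename": "my_file.pdf",
         "content_type": "application/pdf", "size": 2007754},
        {"id": "logo-1", "filename": "logo.png",
         "content_type": "image/png", "size": 1024},
    ],
}


def files_of_type(message, content_type):
    """Return the file entries of a message that match a MIME content type."""
    return [
        entry
        for entry in message.get("files", [])
        if entry.get("content_type") == content_type
    ]


for entry in files_of_type(SAMPLE_MESSAGE, "application/pdf"):
    print(entry["id"], entry["filename"])
```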

If the message contains a file, use that file’s id to download a binary representation of it from the /files/{id}/download endpoint.

$ curl -X GET 'https://api.nylas.com/files/5fuf8***/download' -H 'Authorization: Bearer ACCESS_TOKEN' > my_file.pdf
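The same download can be sketched with Python's standard library. This mirrors the curl call above and is not an official client. The helper names, the example file id, and the destination path are assumptions. Only `build_download_request` is exercised here; `download_file` needs a valid access token and network access.

```python
import shutil
from urllib.parse import quote
from urllib.request import Request, urlopen

API_BASE = "https://api.nylas.com"  # base URL taken from the curl examples


def build_download_request(file_id, access_token):
    """Build (but do not send) the GET request for /files/{id}/download."""
    url = f"{API_BASE}/files/{quote(file_id, safe='')}/download"
    headers = {"Authorization": f"Bearer {access_token}"}
    return Request(url, headers=headers, method="GET")


def download_file(file_id, access_token, destination):
    """Stream the file to disk. Requires a real token and network access."""
    request = build_download_request(file_id, access_token)
    with urlopen(request) as response, open(destination, "wb") as out:
        shutil.copyfileobj(response, out)


# "abc123" is a placeholder id, not a real file.
request = build_download_request("abc123", "ACCESS_TOKEN")
print(request.get_method(), request.full_url)
```

Streaming with `shutil.copyfileobj` avoids loading a large attachment into memory in one piece.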

Further parsing could then be introduced to narrow down the results based on certain keywords or phrases until the user finds the information they need. 
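One simple way to narrow results by keyword is a case-insensitive check that every word of a phrase appears in the text. This is a sketch of that one approach, not a description of how Nylas search works. The subjects and function names are hypothetical.

```python
def matches_phrase(text, phrase):
    """True if every word in phrase occurs somewhere in text (case-insensitive)."""
    haystack = text.lower()
    return all(word in haystack for word in phrase.lower().split())


def filter_subjects(subjects, phrase):
    """Keep only the subjects that contain every word of the phrase."""
    return [subject for subject in subjects if matches_phrase(subject, phrase)]


# Invented subject lines for illustration.
SUBJECTS = [
    "August 2017 Sales Figures",
    "Q3 sales figures (draft)",
    "Our Financial Future",
]

broad = filter_subjects(SUBJECTS, "sales figures")
narrow = filter_subjects(SUBJECTS, "august 2017 sales figures")
print(len(broad), len(narrow))
```

The broad phrase casts the wider net and the longer phrase is more granular. Note that this is plain substring matching, so "sale" would also match "sales", and an empty phrase matches everything.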

Email Parsing Applications

Email parsing can help you eliminate odious manual data processes for your users. Across every industry – Real Estate Tech, Customer Service Software, FinTech, HealthTech, and beyond – repetitive tasks require time and human capital. Email parsing eliminates these tedious tasks, allowing email data to be organized and extracted with ease.

Email Parsing for HealthTech Apps

Consider a medical device sales company that collects leads from emails. The leads might come from various landing pages, from partners, or through cold emails, so there is no consistent structure to the information the emails contain. Consolidating these leads manually is tedious, and the delay can lower their value.

Email Parsing for CRMs

With an email parser, the data is structured into contacts and other categories, and can be moved into a CRM platform or other appropriate workflow. Nylas further streamlines this process with its Contacts API, which syncs contact information (titles, names, etc.) dynamically so that it is updated automatically. Users can act with confidence that their contacts contain the most up-to-date information.
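As a rough illustration of structuring a lead into contact fields, the sketch below uses Python's standard library only. It does not use the Nylas Contacts API. The `lead_from_email` helper, the sample lead, and the phone pattern are assumptions, and the pattern is too loose for production use.

```python
import re
from email.utils import parseaddr

# Loose pattern for phone-like digit runs; an assumption for this example.
PHONE_PATTERN = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def lead_from_email(from_header, body):
    """Build a contact-style record from a From header and a message body."""
    name, address = parseaddr(from_header)
    phone_match = PHONE_PATTERN.search(body)
    return {
        "name": name or None,
        "email": address.lower() or None,
        "phone": phone_match.group(0) if phone_match else None,
    }


# Invented lead for illustration.
lead = lead_from_email(
    "Ana Ruiz <Ana.Ruiz@Example.com>",
    "Please call me at +1 (555) 010-2000 about pricing.",
)
print(lead)
```

A record in this shape can be mapped onto whatever contact fields the target CRM expects.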

Email Parsing for E-Commerce Apps

E-commerce firms that sell goods on various platforms can use email parsing to consolidate sales records from all of them. This can aid accounting, logistics, and order tracking, and it spares customer care teams from sorting through emails manually.
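One way to consolidate order emails from several platforms is to group them by sender domain. The sketch below assumes each platform sends notifications from its own domain. The marketplace domains, the sample emails, and the helper name are all hypothetical.

```python
from collections import defaultdict
from email.utils import parseaddr

# Invented order notifications from two made-up marketplaces.
ORDER_EMAILS = [
    {"from": "Orders <orders@marketplace-a.example>", "subject": "Order 1001 confirmed"},
    {"from": "no-reply@marketplace-b.example", "subject": "You made a sale"},
    {"from": "Orders <orders@marketplace-a.example>", "subject": "Order 1002 confirmed"},
]


def group_by_sender_domain(emails):
    """Map each sender domain to the subjects of the emails it sent."""
    grouped = defaultdict(list)
    for item in emails:
        _, address = parseaddr(item["from"])
        # Everything after the last "@" is treated as the platform's domain.
        domain = address.rpartition("@")[2].lower()
        grouped[domain].append(item["subject"])
    return dict(grouped)


for domain, subjects in group_by_sender_domain(ORDER_EMAILS).items():
    print(domain, len(subjects))
```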

Email Parsing for Real Estate Apps

Real estate brokerages can reel in and organize leads from various listing platforms. With email parsing, new leads can be automatically added into the real estate CRM from email data.

Email Parsing with the Nylas APIs

The Nylas API allows developers to build email parsing capabilities for their users across multiple email service providers. Users can flexibly manage and categorize email information in ways that make sense to their roles and tasks. Customize parsing in any configuration and squash repetition – create a developer account to see the Nylas API in action today. 
