X12 837 and 835 Healthcare Claim Files

Format: X12 837/835
File Extension: (no extension specific to format)
Usage: Electronic submission of healthcare claim and payment information
Standard: ANSI ASC X12 837/835 (www.x12.org)
Open format? You must purchase the standards document
Complexity: Readable text, but complex structure

About the X12 837 and 835 File Formats

The X12 837 and 835 files are industry standard files used for the electronic submission of healthcare claim and payment information.

The 837 files contain claim information and are sent by healthcare providers (doctors, hospitals, etc.) to payors (health insurance companies). A single 837 file may contain multiple claims with information such as the patient's condition for which treatment was provided, the services provided, and the cost of the treatment.

The 835 files contain payment (remittance) information and are sent by the payors to the providers to describe the healthcare services being paid for. Because payors often adjust the claims based on their own rules, 835 files often do not match up one-to-one with the corresponding 837s. 835 files contain such information as what charges were paid, reduced, or denied; deductible, co-insurance, and co-pay amounts; bundling and splitting of claims; and how the payment was made.

Both 837 and 835 files contain patient confidential information protected by the Health Insurance Portability and Accountability Act (HIPAA) and so must be handled with care. Certain security rules must be followed when working with these files.

Reading 837 and 835 files

837 and 835 files are plain text files and can be opened in a standard text editor, but unlike many other text files these are difficult to make sense of due to their cryptic structure. Below is an example of part of an 835 file:

ISA*00*          *00*          *ZZ*UPPLAND         *ZZ*99999999       *040315*1005*U*00401*004075123*0*P*:~GS*HP*ABCCOM*01017*20110315*1005*1*X*004010X091A1~ST*835*07504123~BPR*H*5.75*C*NON************20110315~TRN*1*A04B001017.07504*1346000128~DTM*405*20110308~N1*PR*ASHTABULA COUNTY ADAMH BD*XX*6457839886~N3*2817 DERBY ROAD*SUITE 130~N4*WASHLAND*TX*78300~N1*PE*NORTH MEDICAL CENTER *FI*346608640~N3*38061 DALE AVE~N4*WASHLAND*TX*44004

Opening 837s and 835s in a text editor

When you open these files in a text editor you may find that the entire file is a single line of text. This is because the segment delimiter is often not followed by a line break. If this is the case, the first thing you should do is insert a line break after every segment delimiter. In the example above the segment delimiter is a tilde ( ~ ). Adding line breaks puts each segment on a separate line, allowing you to scroll through the file more easily, but it still doesn't make the file understandable.

To understand what you are looking at you will need an 837 or 835 companion guide that describes each segment and the data elements to expect within it. With the guide in hand you can search through the file, find the data you are interested in, and read the segments around it. The problem is that sequences of segments may repeat over and over again, making it hard to relate repeated detail data (such as treatments and costs) to information higher in the hierarchy (such as payor or provider information). The repeating and hierarchical nature of the data can be very difficult to see when looking at the file as text.
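The line-break step described above can also be scripted. The following is a minimal sketch, not a full parser: the function names are made up for illustration, and it assumes the file starts with an ISA segment whose 16 data elements are followed immediately by the segment delimiter, as in the example above.

```python
def detect_delimiters(raw: str) -> tuple[str, str]:
    """Return (element_separator, segment_delimiter) read from the ISA header."""
    if not raw.startswith("ISA"):
        raise ValueError("expected the file to start with an ISA segment")
    # The character right after "ISA" is the element separator.
    element_sep = raw[3]
    # Walk to the 16th separator; the one-character final ISA element follows
    # it, and the character after that is the segment delimiter.
    idx = 2
    for _ in range(16):
        idx = raw.index(element_sep, idx + 1)
    return element_sep, raw[idx + 2]


def one_segment_per_line(raw: str) -> str:
    """Rewrite the file so that every segment sits on its own line."""
    _, terminator = detect_delimiters(raw)
    # Drop any line breaks that already follow a delimiter, then re-join.
    segments = [s.strip("\r\n") for s in raw.split(terminator)]
    return "\n".join(s + terminator for s in segments if s)
```

Reading the delimiters from the header rather than hard-coding the tilde is a design choice: other files may use different characters, and this keeps the sketch from silently producing one giant line.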

Using an 837/835 Viewer Tool

Specialized tools are available that can open and interpret 837 and 835 files. These tools are typically sold commercially, but some have trial versions available for download. Unfortunately for Mac users, we could find no native Mac 837/835 tools (let us know if you find one). Despite the cost and installation required, these tools will get you much further than looking at the files in a text editor, since they can interpret and display the data in an understandable way. However, if you need to go beyond viewing the files and some basic tallying of the amounts, you may still need a more powerful solution.

Importing the data to SQL

Importing 837/835 files into a database is a complex problem: it requires mapping the hierarchical structure of the file to table records, keeping track of the loops in the file, and mapping fields in the segments to table columns. Some commercial tools can do this, or at least export to delimited text files that can be loaded into a database. Simply parsing the segments into a database table by looking for the segment delimiter doesn't allow you to query the data in any meaningful way and so is almost useless. APIs are available (in Java, C#, etc.) for parsing these files, but using them requires a fair amount of code to map the parsed results into SQL tables. It can be done, but in our experience it takes quite a bit of work.
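To give a feel for what "keeping track of the loops" means in practice, here is a deliberately tiny sketch using Python's built-in sqlite3 module. It is an illustration under assumptions, not a real importer: the table layout and function name are invented, and it relies on the common 837 conventions that NM1 with code 85 names the billing provider and that a CLM segment carries the claim identifier and total charge in its first two data elements. Check both against your companion guide.

```python
import sqlite3


def load_claims(segments, conn, element_sep="*"):
    """Insert one row per CLM segment, tagged with the provider it falls under."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS claim "
        "(claim_id TEXT, billing_provider TEXT, total_charge REAL)"
    )
    provider = None  # context carried down from the enclosing provider loop
    for seg in segments:
        el = seg.strip().rstrip("~").split(element_sep)
        if el[0] == "NM1" and len(el) > 3 and el[1] == "85":
            provider = el[3].strip()
        elif el[0] == "CLM" and len(el) > 2:
            # REAL keeps the sketch short; a real loader would likely store
            # the amount as exact decimal text instead.
            conn.execute(
                "INSERT INTO claim VALUES (?, ?, ?)",
                (el[1], provider, float(el[2])),
            )
    conn.commit()
```

Even this small example has to remember state between segments (the current provider), which is the part that a plain "one row per segment" import loses.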

Specifics of the 837 and 835 File Formats

The 837 and 835 formats conform to the X12 electronic data interchange (EDI) specification. X12 EDI files consist of a sequence of segments that are terminated by a delimiter such as a tilde ( ~ ). If we break our example above into one segment per line we get this:

ISA*00*          *00*          *ZZ*UPPLAND         *ZZ*99999999       *040315*1005*U*00401*004075123*0*P*:~
GS*HP*ABCCOM*01017*20110315*1005*1*X*004010X091A1~
ST*835*07504123~
BPR*H*5.75*C*NON************20110315~
TRN*1*A04B001017.07504*1346000128~
DTM*405*20110308~
N1*PR*ASHTABULA COUNTY ADAMH BD*XX*6457839886~
N3*2817 DERBY ROAD*SUITE 130~
N4*WASHLAND*TX*78300~
N1*PE*NORTH MEDICAL CENTER *FI*346608640~
N3*38061 DALE AVE~
N4*WASHLAND*TX*44004

Each segment begins with a code that specifies the type of segment and is followed by zero or more data elements that contain data values. In the above, the first segment is an ISA segment, followed by a GS segment, and so on. Data elements within a segment are separated by another delimiter, in this case an asterisk ( * ). In order to read a segment you have to break out the data elements and look up the definition of that segment in a guide that tells you what each data element means. For example, the N3 segment is a street address and has the following elements in order:

  • Segment code (N3)
  • Street Address 1
  • Street Address 2

So we can now read the first N3 segment in the example as an address:

N3*2817 DERBY ROAD*SUITE 130~
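Breaking out the data elements of a segment like this one is a simple string split. The sketch below assumes the asterisk and tilde delimiters used in the example; the function names and the dictionary keys are illustrative, not taken from the standard.

```python
N3_FIELDS = ("street_address_1", "street_address_2")


def parse_segment(segment, element_sep="*", terminator="~"):
    """Split one segment into its code and a list of data elements."""
    body = segment.strip("\r\n")
    if body.endswith(terminator):
        body = body[: -len(terminator)]
    code, *elements = body.split(element_sep)
    return code, elements


def read_n3(segment):
    """Label the elements of an N3 segment using the element list above."""
    code, elements = parse_segment(segment)
    if code != "N3":
        raise ValueError(f"not an N3 segment: {code}")
    # The second address line is optional, so pad with empty strings.
    padded = elements + [""] * (len(N3_FIELDS) - len(elements))
    return dict(zip(N3_FIELDS, padded))
```

For the segment above, `read_n3` yields "2817 DERBY ROAD" as the first address line and "SUITE 130" as the second.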

The problem is that the N3 segment code does not tell you whose address this is. To determine that you have to look at the order in which the segments occur in the file. For an 835 file the first N3 segment is the Payor Address. For an 837 file the first N3 is the Billing Provider Address.

Some segments, such as the BPR segment, which specifies financial information such as the total payment amount for the file, don't depend on where they are in the file for their meaning. Other segments, such as the N3, have a meaning that depends on which segments they follow. And yet other segments, such as the N1, contain a code in the second element that tells you what they mean. In the example above the first N1, which is a Name segment, has a code of PR, which means Payer Name.

N1*PR*ASHTABULA COUNTY ADAMH BD*XX*6457839886~
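The two ideas above, a code inside the N1 segment and an N3 whose meaning depends on what precedes it, can be combined in a short sketch. It assumes that each N3 belongs to the most recent N1, which holds for the 835 sample shown here but is a simplification of the full loop rules. The lookup table is deliberately tiny (PE, the payee code, comes from the sample's second N1), and the function name is made up.

```python
N1_ENTITY_CODES = {"PR": "Payer", "PE": "Payee"}  # only the codes in the sample


def parties_with_addresses(segments, element_sep="*", terminator="~"):
    """Return one dict per N1 segment, with the N3 street lines that follow it."""
    parties = []
    for seg in segments:
        elements = seg.strip().rstrip(terminator).split(element_sep)
        if elements[0] == "N1" and len(elements) > 2:
            parties.append({
                # Fall back to the raw code when it is not in the table.
                "role": N1_ENTITY_CODES.get(elements[1], elements[1]),
                "name": elements[2].strip(),
                "street": [],
            })
        elif elements[0] == "N3" and parties:
            # Context decides ownership: attach to the latest N1 seen.
            parties[-1]["street"] = elements[1:]
    return parties
```

Run over the sample segments, this pairs the Derby Road address with the payer and the Dale Ave address with the payee.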

Both 837 and 835 files contain groups of repeating segments called loops. These loops allow the file to contain information about more than one insurance claim, more than one patient, more than one provider, etc. Loops can be spotted in the files by looking for repeating sequences of segments. However, it can be hard to do this manually since many segments are optional or can themselves repeat.

In an 837 file, for example, each claim begins with a "CLM" segment, which indicates the start of the claim loop. If you split the file at each "CLM" segment, you will get a set of claims. The claim loop is fairly easy to spot since it begins with a unique segment code. But not all loops begin with a unique segment code. The rendering provider loop begins with the fairly common "NM1" name segment, which could hold any name (patient name, provider name, etc.). In order to detect a rendering provider loop you have to look at the type-of-name code in the second element, which is "82" for a rendering provider. In that case NM1*82 begins the loop. There are a few other variations of how loops can be detected, such as cases where optional segments can begin a loop.

Loops can also be nested inside other loops. In the 837, the claim loop is inside of a provider loop. There can be more than one provider and each provider can have more than one claim. In some documentation you will see the nested loops represented as a hierarchy with each loop given its code, like this:

    • Interchange Control
      • Functional Group
        • Transaction
          • Submitter 1000A
          • Receiver 1000B
          • Billing / Pay to Provider 2000A
            • Billing Provider 2010AA
            • Pay to Provider 2010AB
            • Subscriber Payer 2000B
              • Subscriber 2010BA
              • Payer 2010BB
              • Patient 2000C
            • Claim 2300
            • ...

As you can see, the 837 and 835 formats aren't the easiest to deal with. If you are doing more than searching for something specific within the file, then it is probably best to find a tool. We tried a more programmatic approach using one of the Java EDI parsing libraries, which worked pretty well but took quite some time to get all the mappings and loops correct.
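For the simpler case of searching within a file, the CLM-based splitting described earlier can be sketched in a few lines. This is an illustration only, with invented function names. It assumes that a claim runs from its CLM segment until the next CLM, the next HL (hierarchical level) segment, or the SE transaction trailer. HL and SE are not discussed in this article, so verify that rule against your companion guide.

```python
def split_claims(segments, element_sep="*", terminator="~"):
    """Group segments into claims, each starting at a CLM segment."""
    claims = []
    current = None  # the claim being collected, or None when outside a claim
    for seg in segments:
        seg = seg.strip()
        code = seg.rstrip(terminator).split(element_sep, 1)[0]
        if code == "CLM":
            current = [seg]
            claims.append(current)
        elif code in ("HL", "SE"):
            current = None  # assumed to close the current claim
        elif current is not None:
            current.append(seg)
    return claims


def starts_nm1_82_loop(segment, element_sep="*"):
    """True when a segment is an NM1 whose type-of-name code is 82."""
    return segment.split(element_sep)[:2] == ["NM1", "82"]
```

A loop that opens with a unique code (CLM) needs only a comparison on the segment code, while a loop that opens with NM1 needs the extra check on the second element.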

Getting the 837/835 File Format Spec

The file format specification can be purchased from the ANSI ASC X12 organization (www.x12.org) as well as from several other sites. Work on the standard is done under the Insurance Subcommittee (N) of the Health Care Task Group (TG2) and in Work Group 2 (WG2) of the X12 organization. Downloadable companion guides for the file format can also be found online by doing some creative web searching.


About Sapling Data

We help our clients analyze and gain insights from many different kinds of data. Sapling provides a web-based data analytics environment that clients can access from their browser and where clients can collaborate with experienced Sapling analysts to explore, query and visualize data. The X12 837 and 835 formats are just one of the many types of data that we have brought in for analysis.