Page size error when parsing sas7bdat file #226

Description

@curtisalexander

Issue

I receive the following error when parsing a rather large sas7bdat file.

Format: SAS data file (SAS7BDAT)
ReadStat: Error parsing page 32767, bytes 0-131071
Error processing .\_rand_ds.sas7bdat: Invalid file, or file has unsupported features

Dataset

The dataset I am using for testing has 3,800,000 rows and 110 columns. More importantly, PROC CONTENTS in SAS reports that it has 33,195 pages (i.e. > 32,767). If I cut the same dataset down so that it has < 32,767 pages, it parses without issue.

OS

I get the above error only on 64-bit Windows (x86 processor). I built the executable (ReadStat_App.exe) using Visual Studio 2019 and the newly added Visual Studio solution for 1.1.5.

If I build on 64-bit Linux (x86 processor), I can parse the file without error. To me this suggests a difference in C integer sizes between macOS / Linux and Windows; for example, a C long is 32 bits on 64-bit Windows but 64 bits on 64-bit Linux and macOS.
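
A rough back-of-the-envelope check of the integer-size idea, written in Python rather than taken from the ReadStat source. It assumes the page size is 131,072 bytes (inferred from "bytes 0-131071" in the error message) and ignores the file header, whose size is not known here. The helper names page_end_offset and wrap_to_int32 are hypothetical, made up for this sketch.

```python
import ctypes

PAGE_SIZE = 131_072     # assumption: inferred from "bytes 0-131071" in the error
FAILING_PAGE = 32_767   # page index reported in the error message

UINT32_MAX = 2**32 - 1  # largest value an unsigned 32-bit integer can hold


def page_end_offset(page_index, page_size=PAGE_SIZE, header_size=0):
    """Byte offset just past the given page (header size unknown, defaults to 0)."""
    return header_size + (page_index + 1) * page_size


def wrap_to_int32(value):
    """Simulate storing a value in a signed 32-bit C integer (silent wraparound)."""
    return ctypes.c_int32(value).value


end = page_end_offset(FAILING_PAGE)

# The end of page 32767 is the first offset that no longer fits in 32 bits.
print(end, end > UINT32_MAX)

# What a 32-bit variable would hold if that offset were stored in it.
print(wrap_to_int32(end))

# Size of a C long on the machine running this script; it differs between
# 64-bit Windows (4 bytes) and 64-bit Linux/macOS (8 bytes).
print(ctypes.sizeof(ctypes.c_long))
```

Under these assumptions, the failing page is the point where byte offsets cross the 4 GiB boundary, which would be consistent with a 32-bit offset or size variable somewhere in the Windows build. This is only a plausibility check and does not identify where in the code the problem lies.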

Troubleshooting

Note that I'm glad to provide the raw dataset I'm using for testing (it is ~ 4.6GB in size). Alternatively I can provide the SAS program I utilized to generate the dataset if you have access to SAS (the program just creates random data to produce a large file).

Or if I can assist by simply rebuilding and testing from a different commit, I'm glad to do so.
