Sec, blogmal! - tidbits - ole-compound-file



April '18



Thu, 18 Nov 2010

OLE Compound Format Extractor

Today, a colleague asked me to help him extract a logo as a file from an openoffice document. This is a task which seems easy enough, given that .odt documents are essentially zip files.

Extracting the .odt revealed (among other files) two interesting files: Object 1 and ObjectReplacements/Object 1. Using file to determine the file-types was quite unhelpful - on two different machines I got:

Object 1: Microsoft Office Document
Object 1: CDF V2 Document, corrupt: Cannot read summary info

And the other file stays enigmatic:

ObjectReplacements/Object 1: data

The ObjectReplacements file starts out with

0000000: 5643 4c4d 5446 0100 3100 0000 0000 0000  VCLMTF..1.......

which some googling reveals to be a StarView Meta file. - This is an openoffice internal format, supposed to have the extension .svm and can be opened by OO Draw.

But I wanted to get at the original file. Both suggestions from file(1) are wrong, but the Microsoft Office Document actually points in the right direction…

Checking in META-INF/manifest.xml gives us the supposed mime-type of application/ and further googling shows us that this is an so called OLE Compound File.

Now while I could easily find a Windows program to parse this file, I found no such thing for Unix. – Which lead me to a quick hack using perl and OLE::Storage_Lite to crate cfx the compound file extractor.

ice:~/ole>./cfx Object\ 1
- Root Entry
x \x{01}Ole
x \x{03}PIC
x \x{03}META
x \x{01}CompObj
x \x{03}ObjInfo
x \x{01}Ole10Native
x \x{01}Ole10ItemName

The …Native file is the one we want. For reasons that I still don't understand you still have to delete the first four bytes from that file which in our case then reveals:

ice:~/ole>file $'\001'Ole10Native  
Ole% 0Native: data
ice:~/ole>dd if=$'\001'Ole10Native skip=4 bs=1 of=Fixed
7648+0 records in
7648+0 records out
7648 bytes transferred in 0.025733 secs (297203 bytes/sec)
ice:~/ole>file Fixed
Fixed: PC bitmap, Windows 3.x format, % 97 x 75 x 4

the relevant .bmp file. Yay!

– Sec

P.S.: If you have a stromg stomach, check the file format specification.

P.P.S.: In the meatime I found out that 7-Zip can also extract OLE Compund Files. Would've been a bit easier :-/

posted at: 18:48 | Category: /tidbits | permanent link to this entry | 2 comments (trackback)

Your Comment
URL/Email: [http://... or mailto:you@wherever] (optional)
Title: (optional)
Save my Name and URL/Email for next time
(Note that comments will be rejected unless you enter 42 in the following box: )

powered by blosxom
in 0.00 s