Deliberate Generation Of Documents With Russian Metadata Present

Credit for the original discovery of the anomalies highlighted in this article belongs to u/tvor_22, whose article about the discovery is "Russia and WikiLeaks: The Case of the Gilded Guccifer".

[Note: This has been rewritten as the original was not in keeping with this author's present standards. The core facts, assertions and points remain as they were. Editing was solely to purge needless hyperbole that came across as trying to over-prove the point.]

Source Materials (link) (link) (link)

Mirror copies are available below (please use originals above if available):

Host: -> 1.doc 2.doc 3.doc
Host: -> 1.doc 2.doc 3.doc

You may also be able to use your browser to directly see the contents of the files as source code (see instructions below).

Reference Materials

Download Word 2007: Rich Text Format (RTF) Specification, version 1.9 (Page 36 covers RSIDs)

The first thing to know is that we are dealing with RTF-format .doc files. This is good news, as RTF is easier to interpret than a binary file and means you can inspect the files using a raw text editor (e.g. Notepad, TextPad, etc.). If you have difficulty opening the files, just change the extension from ".doc" to ".txt".
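If you would rather confirm the format programmatically than trust the file extension, a minimal sketch follows. It relies only on the fact that RTF files begin with the text "{\rtf"; the function names are hypothetical and chosen for illustration, and the path you pass in is wherever you saved your copy.

```python
def looks_like_rtf(data: bytes) -> bool:
    """Return True if the bytes begin with the RTF signature '{\\rtf'."""
    # Leading whitespace is tolerated; a binary Word file starts with
    # different bytes and will fail this check.
    return data.lstrip().startswith(b"{\\rtf")


def is_rtf_file(path: str) -> bool:
    """Check the first bytes of a file on disk for the RTF signature."""
    # Only the start of the file is needed to test the signature.
    with open(path, "rb") as handle:
        return looks_like_rtf(handle.read(64))
```

Calling is_rtf_file on each of the three downloaded files should tell you whether a plain text editor will show you readable control words or binary noise.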

You might be able to copy and paste the following into your browser's address bar to view the original files as text too:


In all 3 documents, the following text string (a stylesheet definition) exists:

{ \s108\ql \li0\ri0\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0\contextualspace \rtlch\fcs1 \af1\afs20\alang1025 \ltrch\fcs0 \f1\fs20\lang1049\langfe1049\cgrid\langnp1049\langfenp1049 \sbasedon0 \snext108 \slink107 \sqformat \spriority1 \styrsid11758497 No Spacing;}

The fact that we find this in all 3 documents means that they were all based on the same document at some point. It's the only logical explanation for all 3 documents sharing this RSID (11758497), and it's not the only RSID shared between the documents.
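To check for shared RSIDs yourself rather than searching by eye, here is a rough sketch. It assumes that every RTF control word containing "rsid" and ending in a number (for example \rsid, \styrsid, \insrsid) carries an RSID value; the regular expression and the function names are my own illustration, not something taken from the specification.

```python
import re

# Matches control words such as \rsid123, \styrsid123, \insrsid123 and
# captures the numeric value. \rsidtbl has no number, so it is skipped.
RSID_PATTERN = re.compile(r"\\[a-z]*rsid[a-z]*(\d+)")


def extract_rsids(rtf_text: str) -> set:
    """Collect every RSID value that appears in the RTF source text."""
    return {int(value) for value in RSID_PATTERN.findall(rtf_text)}


def shared_rsids(*documents: str) -> set:
    """Return the RSID values present in every one of the given documents."""
    rsid_sets = [extract_rsids(doc) for doc in documents]
    return set.intersection(*rsid_sets) if rsid_sets else set()
```

Read each file as text (for instance with open(path, encoding="latin-1").read()) and pass the three strings to shared_rsids. A non-empty result lists the RSIDs the documents have in common.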

The "lang1049", "langfe1049", etc. parts of the string show that the style's language is set to Russian (1049 is the Microsoft locale ID for Russian). (This may help: Microsoft Locale ID Values)
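The language control words can be pulled out of a stylesheet string in the same way. The sketch below is illustrative only: the lookup table covers just three locale IDs (1025, 1033 and 1049), and the names LANG_PATTERN, LCID_NAMES and language_tags are hypothetical.

```python
import re

# Matches \lang, \langfe, \langnp, \langfenp and \alang followed by a number.
LANG_PATTERN = re.compile(r"\\(a?lang(?:fe)?(?:np)?)(\d+)")

# A deliberately tiny subset of Microsoft locale IDs, for illustration only.
LCID_NAMES = {
    1025: "Arabic (Saudi Arabia)",
    1033: "English (United States)",
    1049: "Russian",
}


def language_tags(rtf_text: str) -> list:
    """Return (control word, locale ID, language name) for each language tag."""
    return [
        (word, int(code), LCID_NAMES.get(int(code), "unknown"))
        for word, code in LANG_PATTERN.findall(rtf_text)
    ]
```

Run against the stylesheet string quoted above, this reports 1049 (Russian) for \lang, \langfe, \langnp and \langfenp, and 1025 for \alang.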

So we know that all 3 documents were based on an original document that already had a Russian breadcrumb associated with it, even before the content of those 3 documents was present. At the very least, that is worth pondering.

To clarify: if these were separate documents that had these specific "Russian fingerprints" accidentally added while being handled, they would all have different RSIDs, because Word generates RSID values randomly for each editing session. The only way for what we observe to have happened is for all 3 files to have been based on a pre-tainted template.