Possible to capture a program's database decryption algo?

Coldblackice · May 23, 2012

A friend asked me to see if I can extract a database from a CD of realtor listings. The CD installs a program that gives full access to the database, where you can view all of the information, but only within the confines of its own (small) program window. There are no options to resize the window, resize columns, or export the data.

I found the .db file on the CD, and its file header says it's SQLite. I was able to dump the file with SQLite, but some of the contents are obviously encrypted. It shows the name columns in clear text, but the business name columns are definitely encrypted.

Would it be possible to grab/copy the decryption key/process by watching the program's .exe in a disassembler, like IDA Pro, and finding it that way? If that wouldn't work (or be feasible), what about accessing the .exe's memory space (once it's decrypted the data and showing it in its own program window) -- Would it be possible to access its memory space and extract the (now decrypted in memory) data that way?

Any insight on undertaking this process?

dashpuppy · May 23, 2012

it's encrypted for a reason, just leave it alone.

jbrukardt · May 23, 2012

You're going to want to use something more like SoftICE and watch the program sequence as it "sends" the key to the DB.

Edit: sounds like it's not SQLite encryption, since you can open the file and read some columns. In fact, it's likely not encryption at all, but rather just encoding of the columns.

Coldblackice · May 23, 2012

I seem to have correlated two separate fields in the database dump with two separate fields in the plaintext data (within the program):

'luJY5SkN77FcSupwiWORbg==',
'luJY5SkN77EDFcpEGi7KzA==',

...corresponds in location and similarity within the same/correlated entry to the two phone numbers...

(XXX) XXX-0044
(XXX) XXX-0066

...with the two phone numbers being near exactly the same and the two column fields being near exactly the same.

Any insight into what this "means"? I would think that if it's some type of 'hard' encryption, that there's no way that the two phone numbers would look similar in 'encrypted' form (if they are encrypted after all).

Any idea how I could figure out what method/encryption was used to scramble these, so that I could unscramble them?

If not, and since I have the plaintext phone numbers, is it possible to backward-decrypt/reverse-engineer the encryption process, by having the end result in hand (and assuming these haven't been salted, as they don't appear to)?

Deads0uls · May 24, 2012

jbrukardt said:
You're going to want to use something more like SoftICE and watch the program sequence as it "sends" the key to the DB.

Edit: sounds like it's not SQLite encryption, since you can open the file and read some columns. In fact, it's likely not encryption at all, but rather just encoding of the columns.

I think he is right, it's probably an encoder like MD5 or SHA

A real encryption algorithm would not have any patterns like the one you mentioned with the phone numbers.

robvas · May 24, 2012

You tried looking at the file with SQLite browser, right?

http://sqlitebrowser.sourceforge.net/

Deads0uls said:
I think he is right, it's probably an encoder like MD5 or SHA

A real encryption algorithm would not have any patterns like the one you mentioned with the phone numbers.

A hashing algorithm like MD5 wouldn't either.

Sounds like an interesting project. Maybe try posting it to Reddit and see if someone can figure it out? How big is the entire DB file?

aL Mac · May 24, 2012

Coldblackice said:
I seem to have correlated two separate fields in the database dump with two separate fields in the plaintext data (within the program):

'luJY5SkN77FcSupwiWORbg==',
'luJY5SkN77EDFcpEGi7KzA==',

...corresponds in location and similarity within the same/correlated entry to the two phone numbers...

(212) 599-0044
(212) 599-0066

...with the two phone numbers being near exactly the same and the two column fields being near exactly the same.

Any insight into what this "means"? I would think that if it's some type of 'hard' encryption, that there's no way that the two phone numbers would look similar in 'encrypted' form (if they are encrypted after all).

Any idea how I could figure out what method/encryption was used to scramble these, so that I could unscramble them?

If not, and since I have the plaintext phone numbers, is it possible to backward-decrypt/reverse-engineer the encryption process, by having the end result in hand (and assuming these haven't been salted, as they don't appear to)?

That's very interesting, something that was really encrypted wouldn't have easy patterns like that.

When I was in high school I was a bit mischievous. There was a teacher that was always asking me to fix his computer and threatening to flunk me if I didn't. Once I was in his computer I copied his grade book off on to a disk and brought it home. This was a DOS based grade book with blue screen background, pretty old, but when I tried to access the class's grades it requested a password. I was able to use a hex editor to look through the program and find some sort of back-door that was in the program's code. Typing in a special keyword as the password would give me an "invalid password" message but followed by a sequence of seemingly random numbers. Interesting.

What I did to crack the "code" was create numerous new classes and give them a password of my choosing. e.g. "aaaaa", "bbbbb", "ccccc". I then used the backdoor string and recorded the numbers I got until I discovered the pattern. I found out the numbers represented the password in backwards order, from right to left as the ascii code of the character + the previous character + some constant mod 256. I have no idea why it would have been implemented this way other than perhaps to help teachers who forgot their passwords. It looked to be a built in back door. Anyway, I quickly discovered his password was "Libby".
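
A minimal sketch of a scheme like the one described above, with guesses wherever the description is ambiguous. The constant (7 here), the handling of the first character (nothing precedes it, so 0 is added), and the function names are all assumptions, not the gradebook's actual code.

```python
K = 7  # assumed constant; the real value is not given in the post

def encode(password, k=K):
    """Walk the password right to left; each number is
    (char code + previous char code + k) mod 256."""
    out, prev = [], 0  # assumption: nothing precedes the first character
    for ch in reversed(password):
        code = ord(ch)
        out.append((code + prev + k) % 256)
        prev = code
    return out

def decode(numbers, k=K):
    """Undo encode(): subtract the constant and the previous character."""
    chars, prev = [], 0
    for n in numbers:
        code = (n - prev - k) % 256
        chars.append(chr(code))
        prev = code
    return "".join(reversed(chars))

print(encode("aaaaa"))
print(decode(encode("Libby")))
```

Chosen passwords such as "aaaaa" are useful here because every number after the first comes out identical, which makes the constant and the chaining easy to spot.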

If it's not a hash or real encryption, just some sort of ridiculous scheme like this, you can do the same thing. I believe it's called a "known-plaintext attack".

That being said, first make sure it's not just Base64- or hex-encoded binary. That's what it looks like to me =)

Coldblackice · May 24, 2012

The entire DB is 25MB.

After converting the database from SQLite 2.1 to 3, I was able to look at it in SQLitebrowser. Here's an image from its output:

bit.ly/MLe9Mo

As you can see, only some of the columns' data are encrypted, not the entire database.

I think I've discovered how these columns are partially encrypted -- Base64. Many of the entries end with "=" or "==", and according to Base64's wiki article, that's exactly how it rolls.
---
Padding

The '==' sequence indicates that the last group contained only 1 byte, and '=' indicates that it contained 2 bytes. The example below illustrates how truncating the input of the whole of the above quote changes the output padding:
---

But my attempts at reversing the Base64 lead to scrambled data, so the data was obviously encrypted with something beforehand.
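
A rough way to check that observation, using one of the field values quoted earlier in the thread. The helper names are made up for illustration. If the decoded bytes were plain text, nearly all of them would be printable ASCII.

```python
import base64

def decode_field(text):
    """Strictly Base64-decode one column value; raises on invalid input."""
    return base64.b64decode(text, validate=True)

def printable_ratio(data):
    """Fraction of bytes that are printable ASCII (0x20 to 0x7E)."""
    return sum(0x20 <= b <= 0x7E for b in data) / len(data)

raw = decode_field("luJY5SkN77FcSupwiWORbg==")
print(len(raw), raw.hex(), round(printable_ratio(raw), 2))
```

The value decodes cleanly to 16 bytes, so it is valid Base64, but many of those bytes fall outside the printable range, which fits the idea that something else was applied before the Base64 step.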

After probing around the .exe with a hex editor and looking at the PE headers and layout, I found that it was created with AutoPlay Media Studio. Google searches hint that this studio is known to encrypt with Blowfish and then Base64 the output. I've also found strings in the executable mentioning "blowfish" and "Base64".

Here are signatures found in the .exe. There are a handful of Blowfish ones. I'm not sure exactly what they mean, as far as there being multiple tables. Anyone have any insight?

bit.ly/KDDHos

Good idea on Reddit, I'll also post there.

aL Mac · May 25, 2012

Looks like I was right about Base64. Base64 isn't encryption, it is just a method of taking binary data (from encryption or otherwise) and encoding it into text format, for example if you wish to include binary in an XML file.

However, if it is actually encrypted before the base64 encoding then it is going to be harder to decipher. Your best bet would be to look for the encryption key (string) inside the application using a hex editor perhaps. If the program can encrypt and decrypt the data without any external dependencies then you may think that it must contain the key to do so. If the programmer didn't try to hide it, it could be just in plain text as a constant in the program.

Do you know what language the application was written in? If it was written in .NET for example, you can likely use a tool such as Reflector to disassemble the application back to source code unless it has been obfuscated.

robvas · May 25, 2012

I still think if it was encrypted or even hashed, you wouldn't see similarities like you do between two close plaintexts. I could be wrong. Maybe they are just applying the base64 to a different format of the phone number?
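
One possible explanation for the similarity, assuming the cipher is used in ECB mode (nothing in the thread confirms the mode): Blowfish encrypts 8 bytes at a time, and in ECB mode each block is encrypted independently, so two plaintexts that share their first 8 characters also share their first ciphertext block. The sketch below splits the two quoted values into 8-byte blocks and compares them; the helper name is made up for illustration.

```python
import base64

BLOCK = 8  # Blowfish works on 64-bit (8-byte) blocks

def blocks(b64_text, size=BLOCK):
    """Base64-decode a field and split it into cipher-block-sized chunks."""
    raw = base64.b64decode(b64_text, validate=True)
    return [raw[i:i + size] for i in range(0, len(raw), size)]

a = blocks("luJY5SkN77FcSupwiWORbg==")  # field matched to the number ending 0044
b = blocks("luJY5SkN77EDFcpEGi7KzA==")  # field matched to the number ending 0066

for i, (x, y) in enumerate(zip(a, b)):
    print(i, x.hex(), y.hex(), "same" if x == y else "different")
```

Each value is two blocks long. The first blocks match and the second blocks do not, which is what you would expect if the two phone numbers agree in their first 8 characters and differ only in the last few.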

Coldblackice · May 25, 2012

It was "written" in Autoplay Media Studio:

...

Ruoh · May 25, 2012

Floating a little close to a T17, S1201 violation here. I'm pretty sure you need to contact the publisher of the CD and get permission to decrypt their work.

Coldblackice · May 25, 2012

I have their permission, thanks.

jiminator · May 25, 2012

then ask them for the key

Coldblackice · May 25, 2012

(The company is no more. This database now belongs to another real estate company, that doesn't have the ability (or key) to extract all of the information out of it, besides viewing it through the program.)

It was "written" in Autoplay Media Studio:

http://www.indigorose.com/products/a...-media-studio/

Looking through their documentation, they have a plugin/module that does all encryption/decryption in their software (first Blowfish, then Base64'ing it). The specific function/library is Crypto.BlowfishDecryptString():

http://www.indigorose.com/webhelp/tu...ryptString.htm

The program's main executable, autorun.exe, has this plugin module inside of it. This .exe uses an external file, SQLite.lmd, to access the database file. Autorun.exe does the actual decrypting/encrypting using the function above (Crypto.BlowfishDecryptString())

Now here's what I've found --

Alternating between IDA Pro and OllyDbg, I looked up all the referenced strings. I found the Crypto library and its functions as ASCII strings in autorun.exe's .data section. But I don't know how to set a breakpoint on access to them in the disassemblers.

So I downloaded CheatEngine, which has the ability to access the running program's memory space. I combed through autorun's memory, and found the string "Crypto" in one spot, and "BlowfishDecryptString" in another spot (with the other Crypto library functions right around this same spot in memory).

When I change one letter of "Crypto", the running program can no longer decrypt a database entry, and gives an error. Alternatively, when I change one letter of "BlowfishDecryptString", it can no longer decrypt an entry either. When I change these letters back, the program can again decrypt.

So I believe that I've found the pieces in memory that the program uses to decrypt the database entries. I realize this program's security probably sounds horrendously "juvenile", but I don't think Autoplay Media Studio is any kind of pinnacle of supremely secure protection. I'm sure the decryption key is in plaintext somewhere.

Now where I'm stuck... I need to find the last piece of the decryption function (as defined in the API) -- the key.

How can I find what is accessing the "Crypto" and "BlowfishDecryptString" functions in memory? How would I go about finding the third piece to those being grabbed, the key?

I've set a "Breakpoint on Access" in CheatEngine for that spot in memory, but it shows a "xor eax,eax" accessing that spot in memory, which doesn't seem right (how would xor access another spot in memory?).

But watching the program from/around that breakpoint, stepping through it, it's definitely on the path to the program decrypting the data. So I looked at the hex bytes of that xor instruction and found its location in IDA Pro's analysis of the program...

But now I'm not sure where to head. Any ideas/pointers?

robvas · May 25, 2012

If you search for 'ollydbg cracking tutorial' you'll get a million hits and even YouTube videos.

It's similar to finding the function that verifies a CD key or serial #. Sounds like a fun project

robvas · May 25, 2012

You should be able to put together a script in Perl/Ruby/whatever. If you can get the string they pass to that Blowfish function (the output is Base64'd afterward so it can be stored as a text field in the database), use that as the decryption key.

http://crypt.rubyforge.org/blowfish.html

jiminator · May 25, 2012

there are programs that strip the text strings from exes. you should be able to run one then search through the resulting text for what might be passwords.
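
A minimal sketch of what such a tool does, in the spirit of the Unix `strings` utility: scan the raw bytes for runs of printable ASCII. The function names and the minimum length of 6 are arbitrary choices for illustration.

```python
import re

def ascii_strings(data, min_len=6):
    """Return runs of at least min_len printable ASCII bytes."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    return [m.decode("ascii") for m in re.findall(pattern, data)]

def strings_in_file(path, min_len=6):
    """Read a binary file and list its printable ASCII strings."""
    with open(path, "rb") as f:
        return ascii_strings(f.read(), min_len)

sample = b"\x00\x01Crypto\x00ab\x00BlowfishDecryptString\xff"
print(ascii_strings(sample))
```

Running `strings_in_file("autorun.exe")` would give a list to search for key candidates. This only finds single-byte ASCII text; strings stored as UTF-16, compressed, or obfuscated would be missed.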

Coldblackice · May 26, 2012

Ahhhhh, believe it or not, I got it. Now on to transcoding the database.

Thanks for the help, guys.
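
For the transcoding step, a rough sketch of one way to export a table while passing the encrypted columns through a decryption function. The thread does not say how this was actually done. `export_table` and its parameters are hypothetical, and `decrypt` stands in for whatever routine reverses the Base64 and Blowfish steps once the key is known. Python's `sqlite3` module reads SQLite 3 files, so this assumes the database has already been converted from the older format.

```python
import csv
import sqlite3

def export_table(conn, table, encrypted_cols, decrypt, out_file):
    """Write every row of `table` to CSV, passing chosen columns through decrypt()."""
    cur = conn.execute(f'SELECT * FROM "{table}"')
    names = [d[0] for d in cur.description]
    writer = csv.writer(out_file)
    writer.writerow(names)  # header row
    for row in cur:
        writer.writerow([
            decrypt(value) if name in encrypted_cols and value is not None else value
            for name, value in zip(names, row)
        ])
```

Usage would look something like `export_table(sqlite3.connect("listings.db"), "listings", {"phone"}, my_decrypt, open("out.csv", "w", newline=""))`, where the file, table, and column names are placeholders.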
