Forensic cheatsheet (File Analysis)

File analysis

Glossary

Magic Bytes = signature bytes at the beginning of the file that identify its format

End Bytes = marker bytes that close the file's content (trailer)

Metadata

File Type

file <file>
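When `file` is not available, the same idea can be approximated by comparing the first bytes of a file with known signatures. The sketch below is only an illustration: the `SIGNATURES` table and the function names are hypothetical, the table holds only signatures listed in this cheatsheet, and it is far less complete than the database `file` relies on.

```python
# Minimal sketch: guess a file type from its magic bytes.
# SIGNATURES is an illustrative table, not an exhaustive database.
SIGNATURES = [
    (bytes.fromhex("89504E470D0A1A0A"), "PNG"),
    (bytes.fromhex("255044462D"), "PDF"),
    (bytes.fromhex("7F454C46"), "ELF"),
    (bytes.fromhex("FFD8FF"), "JPEG"),  # common prefix of all JPEG variants
    (bytes.fromhex("504B"), "PKZIP (zip, apk, OOXML)"),
    (bytes.fromhex("1F8B"), "GZIP"),
    (bytes.fromhex("4D5A"), "MS-DOS / Windows executable"),
]


def guess_type(header: bytes) -> str:
    """Return the name of the first signature that prefixes `header`."""
    for magic, name in SIGNATURES:
        if header.startswith(magic):
            return name
    return "unknown"


def guess_file_type(path: str) -> str:
    """Read the first 16 bytes of a file and guess its type."""
    with open(path, "rb") as handle:
        return guess_type(handle.read(16))
```

Signatures that do not sit at offset 0 (TAR, for example) are out of scope for this prefix-only check.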

Searching within files with grep / findstr

Grep

grep <option> <regex> <folder/file> | <filter>
   
#    -e PATTERN : pattern to search for (can be repeated)
#    -E --extended-regexp : interpret the pattern as an extended regular expression
#    -a --text : process binary files as if they were text
#    -i --ignore-case : ignore case
#    -R -r, --recursive : recursive search within the directory (-R also follows symbolic links)
#    -l --files-with-matches : display the file name, not the matching text
#    -v --invert-match : select lines that do not match the pattern
#    -n --line-number : display the line number
#    -o --only-matching : display only the matching portion
#    -A NUM : number of lines to display after the match
#    -B NUM : number of lines to display before the match

~~Recursive search | output file paths~~

grep -Ronia -E "<regex>" ./folder/

~~Recursive search | output match~~

grep -Rnia -E "<regex>" ./folder/

~~Recursive search | output match | unique~~

grep -Rnia -E "<regex>" ./folder/  | sort -u

Findstr

findstr /R /S "<regex>" .\*.*

Links

[a-zA-Z]{2,5}://[^]\"\<\>\^\`\{\|\}]*

http & https

(http|https)://[^]\"\<\>\^\`\{\|\}]*

Mails

[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,10}

~~Domain extraction~~

grep -Roni -E "[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,10}" ./folder/ | sort -u | grep -oi -E "@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,10}" | sort -u
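The same domain extraction can be sketched in Python when the text is already in memory. This is a rough equivalent, not a replacement for the grep pipeline; `EMAIL_RE` and `extract_domains` are hypothetical names, and the pattern is the mail regex from this section with a capture group added around the domain.

```python
import re

# Mail regex from this section, with the domain part captured.
EMAIL_RE = re.compile(r"[a-zA-Z0-9._%+-]+@([a-zA-Z0-9.-]+\.[a-zA-Z]{2,10})")


def extract_domains(text: str) -> list[str]:
    """Return the unique, lowercased mail domains found in `text`, sorted."""
    return sorted({match.group(1).lower() for match in EMAIL_RE.finditer(text)})


sample = "Contact alice@example.com, bob@Example.COM and carol@mail.test.org."
print(extract_domains(sample))
```

Lowercasing before deduplication is a design choice of the sketch: domain names are case-insensitive, so `Example.COM` and `example.com` are counted once.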

IP Addresses

IPv4 :

\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b

Secrets

AWS / Google / JWT / SSH keys / API keys and other secrets:

AKIA|AIza|eyJ|PRIVATE\sKEY|API[\s\-]KEY|ghp_|gitlab_|bitbucket_|xox[baprs]|sk_live_|sk_test_|api_key|secret_key|access_token|auth_token|password|private_key|client_secret|jwt_token|db_password|api_secret|encryption_key|app_secret|oauth_token|ssh_key|master_key|session_token|auth_key|service_account_key|refresh_token|service_account|(postgres|mysql)://
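For text that is already loaded (a decoded blob or an extracted stream, for example), a similar search can be sketched in Python. The pattern below is a shortened subset of the list above, chosen only to keep the example readable; `SECRET_RE` and `find_secrets` are hypothetical names, and case-insensitive matching mirrors the `-i` option used in the grep examples.

```python
import re

# Shortened subset of the secrets pattern above, for illustration only.
SECRET_RE = re.compile(
    r"AKIA|AIza|eyJ|PRIVATE\sKEY|ghp_|xox[baprs]|sk_live_|api_key|password"
    r"|(postgres|mysql)://",
    re.IGNORECASE,
)


def find_secrets(text: str) -> list[tuple[int, str]]:
    """Return (line number, matched keyword) for every hit in `text`."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in SECRET_RE.finditer(line):
            hits.append((lineno, match.group(0)))
    return hits


sample = "user = bob\nDB_PASSWORD=hunter2\nurl = postgres://db.internal/app\n"
for lineno, keyword in find_secrets(sample):
    print(f"{lineno}:{keyword}")
```

Keyword hits only point to lines worth reading; each one still has to be reviewed by hand, since words like `password` also appear in harmless text.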

Linux Exploitation

Search for traces of exploitation in command history (Linux System) :

nmap|nc|netcat|bash\s-i|/dev/tcp|/dev/null|\bid\b|curl|wget|ping|nslookup|rev|shell|grep.*ssh|whoami|uname|history|find|chmod|passwd|kill|rm.*/tmp|/var/|cron|periodic|background|launchctl|\.plist|LaunchAgents|LaunchDaemons|\.profile|\.zprofile|sftp|ssh|scp|driver|cd\s/\b|\|\s*bash|\|\s*sh

Other text search

grep -Rnil "text-to-find-here" ./folder/

grep -Rni "text-to-find-here" -A 3 -B 3 ./folder/

Searching in Binary Files

grep <...> ./binary.raw  --binary-files=text

Count

grep <...> | sort | uniq -c | awk '{print $1 "," $2}' | sed '1i count,<name>'

~~Example~~

grep -a -o -E '\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b' <file> | sort | uniq -c | awk '{print $1 "," $2}' | sed '1i count,ip'
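The counting pipeline can also be sketched in Python, which helps when the result needs further processing. The function names are hypothetical; the regex is the IPv4 pattern from this cheatsheet, compiled as a bytes pattern so it can run on raw file content, like `grep -a`.

```python
import re
from collections import Counter

# IPv4 pattern from this cheatsheet, as a bytes pattern for raw data.
IPV4_RE = re.compile(
    rb"\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}"
    rb"(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b"
)


def count_ips(data: bytes) -> Counter:
    """Count every IPv4 address found in `data`."""
    return Counter(m.group(0).decode("ascii") for m in IPV4_RE.finditer(data))


def to_csv(counts: Counter) -> str:
    """Render the counts as CSV, most frequent address first."""
    lines = ["count,ip"]
    lines += [f"{count},{ip}" for ip, count in counts.most_common()]
    return "\n".join(lines)


log = b"10.0.0.1 login\n10.0.0.1 logout\n192.168.1.20 login\n"
print(to_csv(count_ips(log)))
```

Unlike the shell pipeline, which leaves the rows in the order produced by `sort`, this sketch orders them by frequency.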

Regex search in JSON objects

This Python program returns the ids of the objects that match any regex in a wordlist. The search is restricted to a predefined list of interesting fields within each object. It requires `jq` to be installed.

import json
import shlex
import subprocess
import sys

if len(sys.argv) < 4:
    print("Usage: python search_string.py <json_file_path> regex_wordlist.txt fields.txt")
    sys.exit(1)

json_file, regex_wordlist, interesting_fields = sys.argv[1:4]

# One regex / one jq field path per line; blank lines are skipped.
with open(regex_wordlist, "r") as file:
    search_regex = [line.strip() for line in file if line.strip()]
with open(interesting_fields, "r") as file:
    search_fields = [line.strip() for line in file if line.strip()]

print(search_fields, "\n\n")

all_commands = []
for regex in search_regex:
    # json.dumps produces a quoted, escaped string literal that jq accepts as-is.
    pattern = json.dumps(regex)
    # A field is tested only when it is present (neither null nor false).
    conditions = [f'({s} and ({s} | test({pattern}; "i")))' for s in search_fields]
    # TODO: replace ".users[]" with your object array selector
    jq_filter = "[.users[] | select(" + " or ".join(conditions) + ")] | .[].id"
    cmd = ["jq", "-C", jq_filter, json_file]
    all_commands.append((regex, cmd))
    print(shlex.join(cmd))

print("\n\n#results:>", flush=True)
for regex, cmd in all_commands:
    print("#regex:" + regex, flush=True)
    # An argument list (no shell) avoids quoting problems with the regex.
    subprocess.run(cmd)

# regex_wordlist.txt
API_KEY
pass[a-zA-Z0-9]+
ADMIN[0-9]{2,3}

# fields.txt
.id
.value
.info.comment
.options[0].details

Data Decoding

Base64

echo "<string>" | base64 -d

cat <file> | grep -oE "[A-Za-z0-9+/]{4}|[A-Za-z0-9+/]{3}=|[A-Za-z0-9+/]{2}==" | base64 -d

Hexadecimal

echo "<hexadecimal>" | xxd -r -p > output
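Both decodings are available in the Python standard library, which is convenient when the encoded string was found by a script. A minimal sketch, with hypothetical function names:

```python
import base64


def decode_base64(text: str) -> bytes:
    """Decode Base64; validate=True rejects characters outside the alphabet."""
    return base64.b64decode(text.strip(), validate=True)


def decode_hex(text: str) -> bytes:
    """Decode hexadecimal, accepting hexdump-style whitespace between bytes."""
    return bytes.fromhex("".join(text.split()))


print(decode_base64("aGVsbG8="))    # Base64 input
print(decode_hex("68 65 6c 6c 6f")) # hexadecimal input
```

With `validate=True`, input that is not valid Base64 raises `binascii.Error` instead of being silently filtered, which makes false positives easier to spot.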

Data Extraction

Office Suite Files

Legacy Microsoft Office files (.doc, .xls, .ppt) use the OLE2 binary format. OOXML documents (.docx, .xlsm, etc.) are zip archives that store their content as separate parts. Macros embedded in OOXML files are stored in an OLE2 binary file (typically `vbaProject.bin`) found in the zip archive.
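Because an OOXML document is a zip archive, the presence of a macro container can be checked without opening the document in Office. The sketch below assumes the conventional part name `vbaProject.bin`; `find_vba_parts` is a hypothetical helper, and the in-memory archive only stands in for a real document.

```python
import io
import zipfile


def find_vba_parts(source) -> list[str]:
    """List archive members that look like an OLE2 macro container.

    `source` is a path or a binary file object, as accepted by ZipFile.
    """
    with zipfile.ZipFile(source) as archive:
        return [
            name for name in archive.namelist()
            if name.lower().endswith("vbaproject.bin")
        ]


# Build a tiny stand-in archive in memory to show the call.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("word/document.xml", "<w:document/>")
    archive.writestr("word/vbaProject.bin", b"\xd0\xcf\x11\xe0")
buffer.seek(0)
print(find_vba_parts(buffer))
```

Any member found this way can then be extracted and examined with an OLE2 tool.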

OLE2 Objects

An OLE (Object Linking and Embedding) object is an external file (document, graphic, or video) created using an external application and inserted into another application.

# list the OLE2 streams
oledump <file>
# extract stream <s> (-v decompresses the VBA macro source)
oledump -s <s> -v <file>

RTF Objects

RTF documents do not support macros but can contain embedded files as OLE1 objects.

# list the groups in the file
rtfdump <file>
# list only the groups that contain objects
rtfdump <file> -f O
# extract the object from group <g>
rtfdump <file> -s <g> -H -d > out.bin

PDF

Magic Bytes : `0x255044462D` = `%PDF-`

End Bytes: `0x2525454F46` = `%%EOF`

🗎 Structure

A PDF file consists of numbered objects that reference one another; dictionaries inside those objects describe their type and content.

Scanning the Object Dictionary

pdfid <file>
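The kind of keyword triage that `pdfid` performs can be roughly approximated by counting name tokens in the raw bytes. This is a simplified sketch, not pdfid's actual algorithm: the `KEYWORDS` list is a small illustrative selection, the function name is hypothetical, and obfuscated names or keywords inside compressed streams are not handled.

```python
import re

# Small illustrative selection of names that often deserve a closer look.
KEYWORDS = [b"/JS", b"/JavaScript", b"/OpenAction", b"/AA",
            b"/Launch", b"/EmbeddedFile", b"/AcroForm"]


def count_pdf_keywords(data: bytes) -> dict[str, int]:
    """Count occurrences of each keyword in the raw PDF bytes."""
    counts = {}
    for keyword in KEYWORDS:
        # The lookahead prevents a short name from matching a longer one.
        pattern = re.escape(keyword) + rb"(?![A-Za-z0-9])"
        counts[keyword.decode("ascii")] = len(re.findall(pattern, data))
    return counts


sample = (b"%PDF-1.4\n1 0 obj << /OpenAction 2 0 R >> endobj\n"
          b"2 0 obj << /S /JavaScript /JS (app.alert(1)) >> endobj\n%%EOF")
print(count_pdf_keywords(sample))
```

A non-zero count is only a reason to look at the object in question, not proof that the file is malicious.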

Searching for Malformed Objects

peepdf -fl <file>

Compressed Archives

PKZIP / APK

Magic Bytes : `0x504B` = `PK`

Magic Bytes (empty archive) : `0x504B0506`

Tools : `unzip`, `apktool`

GZIP

Magic Bytes : `0x1F8B`

Tools : `gunzip`, `zcat`

TAR

Magic Bytes : `0x7573746172` = `ustar` (at offset 257, not at the start of the file)

Sometimes, you can list files in a ZIP archive even if it is encrypted.
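Listing works in that case because the entry names are stored in the archive's directory, separately from the encrypted file data. A minimal sketch with Python's `zipfile` module, where `list_zip_entries` is a hypothetical helper; bit 0 of each entry's general-purpose flags indicates whether that entry is encrypted.

```python
import io
import zipfile


def list_zip_entries(source) -> list[tuple[str, int, bool]]:
    """Return (name, uncompressed size, is_encrypted) for every entry.

    `source` is a path or a binary file object. No password is needed,
    because only the archive directory is read.
    """
    with zipfile.ZipFile(source) as archive:
        return [
            (info.filename, info.file_size, bool(info.flag_bits & 0x1))
            for info in archive.infolist()
        ]


# Unencrypted stand-in archive, built in memory to show the call.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("a.txt", "hello")
buffer.seek(0)
print(list_zip_entries(buffer))
```

File names and sizes alone are often enough to decide whether recovering the password is worth the effort.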

Image Files

Extracting Image Properties

exiftool <image>

Data found after the End Bytes is ignored by most image viewers.
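That behaviour makes appended data a common hiding place, so it is worth checking for. The sketch below handles PNG only: it walks the chunk list until the `IEND` chunk and returns whatever follows. The function name is hypothetical, and chunk CRCs are deliberately not verified here.

```python
import struct

PNG_MAGIC = bytes.fromhex("89504E470D0A1A0A")


def png_trailing_data(data: bytes) -> bytes:
    """Return the bytes located after the IEND chunk of a PNG file."""
    if not data.startswith(PNG_MAGIC):
        raise ValueError("not a PNG file")
    offset = len(PNG_MAGIC)
    # Each chunk is: 4-byte length, 4-byte type, <length> data bytes, 4-byte CRC.
    while offset + 12 <= len(data):
        length, chunk_type = struct.unpack(">I4s", data[offset:offset + 8])
        offset += 12 + length
        if chunk_type == b"IEND":
            return data[offset:]
    raise ValueError("IEND chunk not found")
```

A typical use is `png_trailing_data(open("<img>", "rb").read())`; a non-empty result can be written to a file and identified by its own magic bytes.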

JPEG , JPG

Magic Bytes : `0xFFD8FFE0` (JFIF variant; every JPEG starts with `0xFFD8FF`)

End Bytes: `0xFFD9`

PNG

Magic Bytes : `0x89504E470D0A1A0A` = `.PNG.`

End Bytes: `0x49454E44AE426082` = `IEND` followed by its 4-byte CRC

Checking PNG File Integrity

pngcheck <img>

pngcheck -v -f <img>

Executables

MS-DOS, OS/2 or MS Windows

Magic Bytes : `0x4D5A` = `MZ`

Magic Bytes : `0x5A4D` = `ZM` (rare alternative signature)

ELF

Magic Bytes : `0x7F454C46` = `.ELF`

File Recovery / Carving

sudo foremost -v -q -i <file/data> -o <output/directory> #quick mode
sudo foremost -v -i <file/data> -o <output/directory>

sudo photorec <file/data>
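The principle behind carving can be shown with a deliberately naive sketch: scan raw data for a known header and cut at the next matching footer. This is not how `foremost` or `photorec` are implemented; `carve_jpegs` is a hypothetical function that only uses the JPEG start and end markers, and it will cut too early when an image embeds a thumbnail with its own end marker.

```python
JPEG_START = bytes.fromhex("FFD8FF")
JPEG_END = bytes.fromhex("FFD9")


def carve_jpegs(data: bytes) -> list[bytes]:
    """Naively carve JPEG candidates: each header up to the next footer."""
    carved = []
    position = 0
    while True:
        start = data.find(JPEG_START, position)
        if start == -1:
            break
        end = data.find(JPEG_END, start + len(JPEG_START))
        if end == -1:
            break
        carved.append(data[start:end + len(JPEG_END)])
        position = end + len(JPEG_END)
    return carved
```

Each carved candidate should still be validated, for instance by opening it in a viewer or inspecting it with `exiftool`.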

Sources

https://en.wikipedia.org/wiki/List_of_file_signatures
