2023-11-21 14:37:16 +00:00
|
|
|
<!DOCTYPE html>
|
|
|
|
<html>
|
|
|
|
<head>
|
|
|
|
<meta charset = "utf-8">
|
|
|
|
<title>Ghidra Filesystem Storage</title>
|
|
|
|
<style media="screen" type="text/css">
|
|
|
|
pre { margin-left: 120px; }
|
|
|
|
li { margin-left: 90px; font-family:times new roman; font-size:14pt; }
|
|
|
|
h1 { color:#000080; font-family:times new roman; font-size:36pt; font-style:italic; font-weight:bold; text-align:center; }
|
|
|
|
h2 { margin: 10px; margin-top: 40px; color:#984c4c; font-family:times new roman; font-size:18pt; font-weight:bold; }
|
|
|
|
h3 { margin-left: 37px; margin-top: 20px; color:#0000ff; font-family:times new roman; font-size:14pt; font-weight:bold; }
|
|
|
|
h4 { margin-left: 70px; margin-top: 10px; font-family:times new roman; font-size:14pt; font-weight:normal; }
|
|
|
|
h5 { margin-left: 70px; margin-top: 10px; font-family:times new roman; font-size:14pt; font-weight:normal; }
|
|
|
|
h6 { color:#000080; font-family:times new roman; font-size:18pt; font-style:italic; font-weight:bold; text-align:center; }
|
|
|
|
p { margin-left: 90px; font-family:times new roman; font-size:14pt; }
|
|
|
|
body {
|
|
|
|
counter-reset: section;
|
|
|
|
}
|
|
|
|
h2 {
|
|
|
|
counter-reset: topic;
|
|
|
|
}
|
|
|
|
h3 {
|
|
|
|
counter-reset: rule;
|
|
|
|
}
|
|
|
|
h2::before {
|
|
|
|
counter-increment: section;
|
|
|
|
content: counter(section) ": ";
|
|
|
|
}
|
|
|
|
h3::before {
|
|
|
|
counter-increment: topic;
|
|
|
|
content: counter(section) "." counter(topic) " ";
|
|
|
|
}
|
|
|
|
h4::before {
|
|
|
|
counter-increment: rule;
|
|
|
|
content: counter(section) "." counter(topic) "." counter(rule) " ";
|
|
|
|
}
|
|
|
|
|
|
|
|
div.info {
|
|
|
|
margin-left: 10px; margin-top: 80px; font-family:times new roman; font-size:18pt; font-weight:bold;
|
|
|
|
}
|
|
|
|
|
|
|
|
</style>
|
|
|
|
</head>
|
|
|
|
<body>
|
|
|
|
<H1>Ghidra Filesystem Storage</H1>
|
|
|
|
|
|
|
|
<H2>Introduction</H2>
|
|
|
|
|
|
|
|
<P>The purpose of this document is to provide technical details of the file storage scheme(s) employed by
|
|
|
|
both Ghidra projects and Ghidra Server repositories. As of this writing Ghidra has employed two
|
|
|
|
different local filesystem storage schemes:</P>
|
|
|
|
|
|
|
|
<UL>
|
|
|
|
<LI>Indexed Filesystem (current)</LI>
|
|
|
|
<LI>Mangled Filesystem (legacy, pre-9.0 PUBLIC release)</LI>
|
|
|
|
</UL>
|
|
|
|
|
|
|
|
<P>In addition to providing details of each storage scheme, some details are provided about how project/repository database
|
|
|
|
files are stored within each filesystem as well as some troubleshooting tips to aid in any manual interventions
|
|
|
|
which may be required.</P>
|
|
|
|
|
|
|
|
<P>At this point in time all filesystem implementations rely on the use of a property file for each project defined
|
|
|
|
file (<I>*.prp</I>). For all database type project files there will be a corresponding subdirectory (<I>~*.db</I>)
|
|
|
|
which is used to store all content related to a project file database (e.g., ProgramDB). The naming and organization
|
|
|
|
of these two differ significantly between the two filesystem implementations.
|
|
|
|
|
|
|
|
<H3>Name Mangling/Encoding</H3>
|
|
|
|
|
|
|
|
<P>Ghidra uses the following name mangling/encoding for both Ghidra Server project repository subdirectories as well as
|
|
|
|
project folder and file naming within the legacy Mangled Filesystem. The goal of this encoding scheme is to preserve case-sensitive
|
|
|
|
naming while allowing storage on a single-case or case-insensitive native filesystem. This is achieved by mutating the original
|
|
|
|
name with the following character substitutions:</P>
|
|
|
|
|
|
|
|
<UL>
|
|
|
|
<LI>All uppercase letters replaced with an underscore ('_') followed by the lowercase letter, and</LI>
|
|
|
|
<LI>All underscore ('_') characters are replaced with two underscores ("__")
|
|
|
|
</UL>
|
|
|
|
|
|
|
|
<P>For a Ghidra Server repository named "My_Project", the resulting filesystem storage folder named "_my___project"
|
|
|
|
would appear within the server repositories directory.</P>
|
|
|
|
|
|
|
|
<H3>Ghidra Project Storage</H3>
|
|
|
|
|
|
|
|
<P>Ghidra projects employ multiple filesystem storage directories within the top-level project directory (<I>*.rep</I>):</P>
|
|
|
|
|
|
|
|
<UL>
|
|
|
|
<LI>Local project file storage (<I>idata</I>) provides the primary storage area for non-versioned project files
|
|
|
|
and for versioned files checked-out by the user.</LI>
|
|
|
|
<LI>Versioned project file storage (<I>versioned</I>) is present with private non-shared projects only and provides
|
|
|
|
version control storage similar to a Ghidra Server repository.</LI>
|
|
|
|
<LI>User data storage (<I>user</I>) provides non-versioned storage of user-specific data associated with a specific project file.</LI>
|
|
|
|
</UL>
|
|
|
|
|
|
|
|
<H3>Ghidra Server Storage</H3>
|
|
|
|
|
|
|
|
<P>The Ghidra Server employs a separate filesystem storage directory for each project repository using a mangled
|
|
|
|
name (see <I>Name Mangling/Encoding</I> above). While all newly created project repositories will use the latest
|
|
|
|
Indexed Filesystem storage scheme Ghidra continues to support the legacy Mangled Filesystem which may be in use
|
|
|
|
by older Ghidra Server installations. The <I>svrAdmin</I> command provides the ability to migrate an older
|
|
|
|
Mangled Filesystem to the current Index Filesystem (see <I>server/serverREADME.html</I>).</P>
|
|
|
|
|
|
|
|
<H2>Indexed Filesystem (current)</H2>
|
|
|
|
|
2024-10-03 09:31:02 +00:00
|
|
|
<P>This filesystem overcomes the project file-path length limitations inherent to the legacy Mangled Filesystem and
|
2023-11-21 14:37:16 +00:00
|
|
|
utilizes an index file to store project file-paths and the corresponding 8-digit hexadecimal identifier for
|
|
|
|
each
|
|
|
|
(e.g., <I>00001234.prp</I> / <I>~00001234.db</I>). The following files are used to manage the filesystem content
|
|
|
|
and are located at the root of the filesystem storage directory.</P>
|
|
|
|
|
|
|
|
<UL>
|
|
|
|
<LI>Index File (<I>~index.dat<I>) - current index file
|
|
|
|
<LI>Backup Index File (<I>~index.bak</I>) - backup of current index file</LI>
|
|
|
|
<LI>Temporary Index File (<I>~index.tmp</I>) - temporary index file created during update</LI>
|
|
|
|
<LI>Index Journal File (<I>~journal.dat</I>) - journal or recent changes not yet reflected by index</LI>
|
|
|
|
<LI>Backup Journal File (<I>~journal.bak</I>) - backup of journal file created during incremental update</LI>
|
|
|
|
<LI>Index Lock File (<I>~index.lock</I>) - index lock used to coordinate index access and modification</LI>
|
|
|
|
</UL>
|
|
|
|
|
|
|
|
<P><B>Index Rebuild</B> - If the index file becomes corrupt it may be easily rebuilt while the associated project
|
|
|
|
is closed or Ghidra Server stopped to avoid file access during the repair process. While the filesystem is not
|
|
|
|
active/in-use all of the index
|
|
|
|
related files mentioned above may be manually deleted from the root of the appropriate filesystem storage directory
|
|
|
|
(see Ghidra Project Storage and Ghidra Server Storage above). Once the filesystem store is
|
|
|
|
started (e.g., project opened or Ghidra Server started) the missing index will trigger an automatic rebuild of the
|
|
|
|
index based upon the details provided by each property file contained within (<I>*.prp</I>).
|
|
|
|
|
|
|
|
<P><B>Locating Project File Storage</B> - Locating individual project files on disk requires interpretation of
|
|
|
|
the index file (<I>~index.dat</I>) and traversing the numbered storage folders appropriately. When locating a project
|
|
|
|
file within the index it is important to know both the full Ghidra project directory path and project filename.
|
2024-10-03 09:31:02 +00:00
|
|
|
If the project filename is unique you can simply search for the filename within the index, otherwise you will have
|
2023-11-21 14:37:16 +00:00
|
|
|
to search for project folder path first. Sample <I>~index.dat</I> file:</P>
|
|
|
|
|
|
|
|
<PRE>
|
|
|
|
VERSION=1
|
|
|
|
/
|
|
|
|
00000003:myFile:a701ee4b1c71321380792951888
|
|
|
|
/A
|
|
|
|
00000100:anotherFile:a701ee4920f909328022843906
|
|
|
|
00000105:yetAnotherFile:a701ee48743913628104779815
|
|
|
|
/A/B
|
|
|
|
00001234:myFile:a701ee48019210920546276045
|
|
|
|
00001200:myFile.1:a701ee48491297248400412904
|
|
|
|
/A/B/C
|
|
|
|
00000004:someFile:a701ee4a23590324427866017
|
|
|
|
NEXT-ID:1250
|
|
|
|
MD5:d41d8cd98f00b204e9800998ecf8427e
|
|
|
|
</PRE>
|
|
|
|
|
|
|
|
<P>Once the project file of interest has been located within the index file and the corresponding 8-digit hex file number identified,
|
|
|
|
the storage subfolder name is derived from the 2nd and 3rd digits of the file number. Example file number "00001234" will be contained
|
|
|
|
with the subfolder "12". Within this subfolder the "00001234.prp" file should exist as will all other numbered files which have the
|
|
|
|
same 2nd/3rd digits. The storage hierarchy for the above index file would have the following hierarchy:</P>
|
|
|
|
|
|
|
|
<PRE>
|
|
|
|
./~index.dat
|
|
|
|
./00/
|
|
|
|
00000003.prp
|
|
|
|
~00000003.db/
|
|
|
|
00000004.prp
|
|
|
|
~00000004.db/
|
|
|
|
./01/
|
|
|
|
00000100.prp
|
|
|
|
~00000100.db/
|
|
|
|
00000105.prp
|
|
|
|
~00000105.db/
|
|
|
|
./12/
|
|
|
|
00001200.prp
|
|
|
|
~00001200.db/
|
|
|
|
00001234.prp <-- /A/B/myFile property file
|
|
|
|
~00001234.db/ <-- /A/B/myFile database directory
|
|
|
|
</PRE>
|
|
|
|
|
|
|
|
<H2>Mangled Filesystem (legacy)</H2>
|
|
|
|
|
|
|
|
<P>This filesystem utilizes mangled naming for all project folders and files and follows the same hierarchy as the project.
|
|
|
|
For example, a project file with the path <I>"/A_1/B_1/myFile"</I> would be found stored as
|
|
|
|
<I>"./_a__1/_b__1/my_file.prp"</I> and <I>"./_a__1/_b__1/~my_file.db/"</I>. Due to file path-length limitations of native
|
|
|
|
filesystems the use of this storage scheme is no longer used by default and has been retained only for backward
|
|
|
|
compatibility with older projects and repositories.
|
|
|
|
|
|
|
|
<H2>Removal of Corrupt Files</H2>
|
|
|
|
|
|
|
|
<P>If a project or Ghidra Server repository contains a corrupt file it may not be possible to remove the file via the
|
|
|
|
Ghidra GUI or API. While a detailed triage of a corrupt file may be possible by the Ghidra Development Team, such files
|
|
|
|
may need to be removed after being copied for triage. For a shared repository this will require stopping the Ghidra Server
|
|
|
|
and digging into the appropriate named repository directory. For a local project simply ensure that the project is
|
|
|
|
not in use.</P>
|
|
|
|
|
|
|
|
<P>For the local project case it will be neccessary to isolate the storage issue since it could be caused by the
|
|
|
|
local project store (<I>*.rep/idata/</I>) or versioned repository (Ghidra Server or private non-shared (<I>*.rep/versioned/</I>).
|
|
|
|
The Ghidra Server case can easily be identified by creating another temporary shared project to the same shared repository
|
|
|
|
and check the behavior of the project file in question. If the same behavior is observed the issue is likely on the server.
|
|
|
|
If you need assistance identifying the source of the bad behavior or recommended resolution please submit a Ghidra
|
|
|
|
trouble ticket.<P>
|
|
|
|
|
|
|
|
<P>As discussed for each filesystem above, the specific <I>*.prp</I> file and <I>~*.db/</I> directory should be
|
2024-10-03 09:31:02 +00:00
|
|
|
identified and copied for triage. Keeping a copy will enable triage and may enable restoring the file in the
|
|
|
|
future if possible. Once this file and corresponding directory have been copied they may be removed from the filesystem.
|
2023-11-21 14:37:16 +00:00
|
|
|
For the indexed filesystem case the index related files can be deleted which will trigger a rebuild of the index
|
|
|
|
(see Indexed Filesystem above).
|
|
|
|
|
|
|
|
</body>
|
2024-10-03 09:31:02 +00:00
|
|
|
</html>
|