The Overview page is the front page of this documentation. It provides links to the respective subsections and some summary information on the estimations and descriptive statistics that were run.
The Variables page lists all variables found throughout the data files in the folder. Currently only Stata (.dta) and csv files are supported (and the latter are analysed using Stata as well). The variables are organised by name, and each name has a subpage that lists detailed information about the respective variable in the relevant data file.
The documentation recognises a number of different file formats, each of which gets its own subsection on the Files page:
Recognised data files are examined for the variables they contain and are linked to the script files where they are produced or used.
Scripts are the central hub of the documentation, linking data files and variables across inputs and outputs. Currently the script files are parsed for specific types of commands, including:
The classification of commands for Stata can be changed in the config/statacmdtype.txt configuration file. Statdoc tries to find the corresponding inputs and outputs for each script and lists them at the top of the page. You can help Statdoc in this task by using unique file names throughout the project. It can perform limited wildcard matching for file names that include locals and globals, provided the remainder of the file name is sufficiently unique.
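As an illustration of the wildcard matching (the file and macro names here are hypothetical), a do-file that builds a file name from a global can still be matched against the project's files as long as the fixed part of the name is unique:

```stata
* Statdoc cannot resolve the value of $datadir at documentation time,
* but it can wildcard-match the unique remainder "survey_wave2.dta"
* against the data files it has indexed in the project folder.
global datadir "data/raw"
use "$datadir/survey_wave2.dta", clear
save "$datadir/survey_wave2_cleaned.dta", replace
```

If several files share the same trailing name, the match is ambiguous, which is why unique file names help.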
Scripts can contain documenting comments, which are marked with /** */ (note that there are exactly two stars). These comments are parsed for additional tags (e.g. @author) that will be added to the documentation.
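For example, a documenting comment at the top of a do-file could look like the following (only @author is confirmed by this documentation; any other tags you use should be checked against the templates Statdoc ships with):

```stata
/**
 * Cleans the raw survey data and saves an analysis-ready file.
 *
 * @author Jane Doe
 */
use "rawdata.dta", clear
```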
Statdoc also recognises when a log file is written and will display the log file alongside the commands it has found (look for the log button in the top right corner of the command list).
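A minimal sketch of a do-file that writes a log Statdoc can pick up (file and variable names are hypothetical):

```stata
* The log file written here can be displayed by Statdoc
* next to the commands it parsed from this script.
log using "analysis.log", replace text
regress y x1 x2
log close
```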
Statdoc recognises image files and displays them if the relative path is intact (i.e. the statdoc folder must be in the root folder of the project).
Statdoc recognises document files and can link them to script files (e.g. log files). In the future Statdoc might index document files for more cross-links (see also Tokens).
Other Files lists all files currently not recognised or processed by Statdoc.
The Index contains an alphabetic list of all n-gram tokens found throughout the directory. It is currently limited mainly to file names but might be expanded in the future, in which case it would serve as the main index.
These links point to the next or previous item (alphabetically) of a given type.
These links show and hide the HTML frames. All pages are available with or without frames.
Statdoc tries hard to discover the features of your statistics project by itself. There are a few things you can do to help, though:
A lot of things in Statdoc are customisable, starting with stylesheet.css, where various aspects of the HTML display can be adjusted, e.g. the colour of most elements. Furthermore, all HTML pages are produced from the templates in the templates directory; this allows a lot of freedom in defining what is displayed, in which order, and where. The same folder also contains a number of .do files that govern how information is extracted from Stata files. Finally, the config folder holds some configuration files which allow customisation of how data is read.
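For instance, a small colour tweak in stylesheet.css might look like the following (the selector is a generic HTML heading; check the stylesheet.css shipped with Statdoc for the class names its templates actually use):

```css
/* Hypothetical example: change the colour of top-level headings.
   Look up the real selectors in the shipped stylesheet.css. */
h1 {
    color: #336699;
}
```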
The following is a loose collection of things that do not work (yet), things that need to be revised, and ideas for future functionality.
CRITICAL TO 1.0
+ add some information about variables to static main screen (pics?)
+ rethink variables-summary.vm (pictures instead of colortable)
+ Add licence (apache) to resource files
- rework Console.java
+ think about good command line options and document them
- upload to github
- implement downloadable Stata package
+ Write a proper help doc (up to date).
+ readme.md
- clean javadoc

BUGS
- StataParseDoFileTask is inaccurate if a variable is named after a keyword (e.g. weight)
- drop is (has to be) recorded as a manipulation, promoting dropped variables
- files marked as not recognised when they are just not parsed (documents)
- height of source/log iframe not 100% on Windows and Linux

CHECK
+ w3 validator
- Windows 7: Chrome, Firefox, IE 9+
- Android: built-in
- iOS: iPhone and iPad Safari (iOS 7, maybe 6)
- Mac OSX: Chrome, Firefox, Safari
- Linux: Firefox, Chrome

ITEMS/DISPLAY

VARIABLES
+ work with csv files

FILES
- implement FOLDERS and have them as a category, make sure they work ok

data
- work on the do files in template
- use tsset to derive more info about data

scripts
- work in .sh files as well
- work in .m files as well
- work in .log/.html

dofiles
- deal with blocks (program loops {}) better (start/end)
- add a warning if an index goes out of bounds
- doc on top: general info (allow more space before)
+ doc in between, document program, loops or single lines
- link images to graph lines and eventually to variables (essential)

images
- display thumbnails in overviews (can be small-sized large versions)
- work out .eps files (does not really work)
- deal with unavailable source files (i.e. check that the folder is a subfolder)

documents
- parse/tokenize if relevant (i.e. only recognise tokens that exist from before)

TOKENS
- make sure to capture the n-gram level

FLOW (currently no top-level for this)
+ focus on scripts
- tree for each file that does not have a "parent" and unused
- what about loops?
- work out relationships between variables, e.g.
  siblings = used together in regression/stats
  dependvar = marker for first variable in estimation commands
  manipulate = together in manipulation, this could be directional

IDEAS
- display icons for files (maybe also other stuff)
- have a daemon mode that reacts to file changes
- have a server mode that allows you to actively change things from within
- make sure only to update the relevant files (at least for Stata)
- fix whether update or rebuild
- probably everything should go into a database after all, easier to search
- actually, best the DB should use something like Hibernate
- run as a service on AWS with a dropbox folder
- run parsed do files that output to line-dependent log files

PROBLEMS
+ deal with too many files (reduced tokens to one page per starting letter)
- deal with large files for read-in
- htmlCompressor