The Overview page is the front page of this documentation. It provides links to the respective subsections and some summary information on the estimations and descriptive statistics that were run.
The Variables page lists all variables found throughout the data files in the folder. Currently only Stata (.dta) and csv files are supported (and the latter are analysed using Stata as well). The variables are organised by name, and each name has a subpage that lists detailed information about the respective variable in the relevant data file.
The documentation recognises a number of different file formats, each of which gets its own subsection on the Files page:
Recognised data files are examined for the variables they contain and are linked to the script files where they are produced or used.
Scripts are the central hub of the documentation, linking data files and variables across inputs and outputs. Currently the script files are parsed for specific types of commands, including:
The classification of commands for Stata can be changed in the config/statacmdtype.txt configuration file. Statdoc tries to find the corresponding inputs and outputs for each script and lists them at the top of the page. You can help Statdoc in this task by using unique file names throughout the project. It can perform limited wildcard matching for file names that include locals and globals, provided the remainder of the file name is sufficiently unique.
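As an illustration of the wildcard matching (the file and macro names here are hypothetical), a do-file that builds a file name from a global can still be matched against the project's files as long as the fixed part of the name is unique:

```stata
* Statdoc cannot resolve the value of $datadir at documentation time,
* but it can wildcard-match the unique remainder "survey_wave2.dta"
* against the data files it has indexed in the project folder.
global datadir "data/raw"
use "$datadir/survey_wave2.dta", clear
save "$datadir/survey_wave2_cleaned.dta", replace
```

If several files share the same trailing name, the match is ambiguous, which is why unique file names help.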
Scripts can contain documenting comments, which are marked with /** */ (note that there are exactly two stars). These comments are parsed for additional tags (e.g. @author) that will be added to the documentation.
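For example, a documenting comment at the top of a do-file could look like the following (only @author is confirmed by this documentation; any other tags you use should be checked against the templates Statdoc ships with):

```stata
/**
 * Cleans the raw survey data and saves an analysis-ready file.
 *
 * @author Jane Doe
 */
use "rawdata.dta", clear
```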
Statdoc also recognises when a log file is written and will display the log file alongside the commands it has found (look for the log button in the top right corner of the command list).
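A minimal sketch of a do-file that writes a log Statdoc can pick up (file and variable names are hypothetical):

```stata
* The log file written here can be displayed by Statdoc
* next to the commands it parsed from this script.
log using "analysis.log", replace text
regress y x1 x2
log close
```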
Statdoc recognises image files and displays them if the relative path is intact (i.e. the statdoc folder must be in the root folder of the project).
Statdoc recognises document files and can link them to script files (e.g. log files). In the future Statdoc might index document files for more cross-links (see also Tokens).
Other Files lists all files currently not recognised or processed by Statdoc.
The Index contains an alphabetic list of all n-gram tokens found throughout the directory. It is currently limited mainly to file names but might be expanded in the future, in which case it would serve as the main index.
These links point to the next or previous item (alphabetically) of a given type.
These links show and hide the HTML frames. All pages are available with or without frames.
Statdoc tries hard to discover the features of your statistics project by itself. There are a few things you can do to help, though:
A lot of things in Statdoc are customisable, starting with stylesheet.css, where various aspects of the HTML display can be adjusted, e.g. the colour of most elements. Furthermore, all HTML pages are produced from the templates in the templates directory; this allows a lot of freedom in defining what is displayed, in which order, and where. The same folder also contains a number of .do files that govern how information is extracted from Stata files. Finally, the config folder holds some configuration files which allow customisation of how data is read.
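For instance, a small colour tweak in stylesheet.css might look like the following (the selector is a generic HTML heading; check the stylesheet.css shipped with Statdoc for the class names its templates actually use):

```css
/* Hypothetical example: change the colour of top-level headings.
   Look up the real selectors in the shipped stylesheet.css. */
h1 {
    color: #336699;
}
```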
The following is a loose collection of things that do not work (yet), things that need to be revised, and ideas for future functionality.
CRITICAL TO 1.0
+ add some information about variables to static main screen (pics?)
+ rethink variables-summary.vm (pictures instead of colortable)
+ Add licence (apache) to resource files
- rework Console.java
+ think about good command line options and document them
- upload to github
- implement downloadable Stata package
+ Write a proper help doc (up to date).
+ readme.md
- clean javadoc

BUGS
- StataParseDoFileTask is inaccurate if a variable is named after a keyword (e.g. weight)
- drop is (has to be) recorded as a manipulation, promoting dropped variables
- files marked as not recognised when they are just not parsed (documents)
- height of source/log iframe not 100% on Windows and Linux

CHECK
+ w3 validator
- Windows 7: Chrome, Firefox, IE 9+
- Android: built-in
- iOS: iPhone and iPad Safari (iOS 7, maybe 6)
- Mac OSX: Chrome, Firefox, Safari
- Linux: Firefox, Chrome

ITEMS/DISPLAY

VARIABLES
+ work with csv files

FILES
- implement FOLDERS and have them as a category, make sure they work ok

data
- work on the do files in template
- use tsset to derive more info about data

scripts
- work in .sh files as well
- work in .m files as well
- work in .log/.html

dofiles
- deal with blocks (program loops {}) better (start/end)
- add a warning if an index goes out of bounds
- doc on top: general info (allow more space before)
+ doc in between, document program, loops or single lines
- link images to graph lines and eventually to variables (essential)

images
- display thumbnails in overviews (can be small-sized large versions)
- work out .eps files (does not really work)
- deal with unavailable source files (i.e. check that the folder is a subfolder)

documents
- parse/tokenize if relevant (i.e. only recognise tokens that exist from before)

TOKENS
- make sure to capture the n-gram level

FLOW (currently no top-level for this)
+ focus on scripts
- tree for each file that does not have a "parent" and unused
- what about loops?
- work out relationships between variables, e.g.
  siblings = used together in regression/stats
  dependvar = marker for first variable in estimation commands
  manipulate = together in manipulation, this could be directional

IDEAS
- display icons for files (maybe also other stuff)
- have a daemon mode that reacts to file changes
- have a server mode that allows you to actively change things from within
- make sure only to update the relevant files (at least for Stata)
- fix whether update or rebuild
- probably everything should go into a database after all, easier to search
- actually, best the DB should use something like Hibernate
- run as a service on AWS with a dropbox folder
- run parsed do files that output to line-dependent log files

PROBLEMS
+ deal with too many files (reduced tokens to one page per starting letter)
- deal with large files for read-in
- htmlCompressor