How to Automate Your Research with DataMiner

Run multiple operations with our "no-code" application

In this article I will show you how to use the DataMiner app to run dozens of backtests with a single click. But first, a short introduction to DataMiner:

With DataMiner you can:

  • Run multiple backtest variations of screens and ranking systems

  • Retrieve derived data to train AI systems (ranks, technical factors, z-scores)

  • Download raw point-in-time data (requires a data license from FactSet or S&P Compustat)

  • Write simple, human-readable syntax to configure the operations

  • Get lower latency, since the overhead of the website is removed

NOTE: DataMiner is an open-source project. Let us know if you wish to contribute to the official release. You will find the source code repository here.

To get started you need an API key. Click the user account icon at the top right, go to DataMiner & API, and click Create Key. You will need both the ID and the key. This key is private and personal: anyone who has it can run requests using your account. If you think your key has been compromised, create a new one and delete the old one.

Next, download the DataMiner app. The latest DataMiner version, samples, and other documentation are available from this Dropbox folder. You can find this download link, and several others, on the same page where you just created the API key. Go to the folder for the latest release, download the version for your PC, and install it. There are versions for Linux, macOS and Windows.

The first time you run the app you will be asked for the API ID and key. Once you have entered them you can run your first DataMiner script, such as the RankPerformance script below.

Main:
    Operation: RankPerformance
    On Error:  Stop

Default Settings:
    PIT Method: Prelim
    Buckets: 5
    Start Date: 2005-01-01
    End Date: 2020-01-01
    Rebalance Frequency: 4Weeks
    Benchmark: IWM # Russell 2000
    Universe:
        Rules:
            - Sector = MATERIALS
        Starting Universe: Prussell 2000

Iterations:
    -
        Name: Price to Book
        Ranking:
            Formula: Pr2BookQ
            Lower is Better: true
    -
        Name: Price to Sales
        Ranking:
            Formula: Pr2SalesTTM
            Lower is Better: true

These instructions operate on the stocks in the Russell 2000 universe that belong to the MATERIALS sector. (The Russell universes in our system are generated by us, which is why we call them Prussell.)

The script runs two iterations, one for "Price to Book" (Pr2BookQ, where Q is the latest quarter) and one for "Price to Sales" (Pr2SalesTTM, where TTM is trailing twelve months). Each iteration runs 6 backtests: one for each of the 5 buckets and one for the universe, for a total of 12 backtests.
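The backtest count is simple arithmetic, sketched here in Python. The function name total_backtests is made up for illustration and is not part of DataMiner; it only assumes, as described above, that each iteration runs one backtest per bucket plus one for the universe.

```python
def total_backtests(iterations: int, buckets: int) -> int:
    """Number of backtests for a RankPerformance run, assuming each
    iteration runs one backtest per bucket plus one for the universe."""
    return iterations * (buckets + 1)


# Two iterations with 5 buckets each, as in the script above.
print(total_backtests(iterations=2, buckets=5))  # 12
```

Raising Buckets or adding iterations multiplies the work accordingly, which is worth keeping in mind before you click "Execute" on a long script.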

Click "Execute" and after a short delay you will see the output. The table for "Price to Sales" reports several statistics. It shows, for example, that the top 20 percent of the stocks (bucket 5) returned 15.44% annualized during the test period. It also shows that the average number of positions held in each bucket was consistent, at around 22 stocks. If the average number of stocks differs widely between buckets, it may be a sign that the factors you chose have a lot of N/As.
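If you want to check bucket sizes without eyeballing the table, a rough sketch like the one below may help. It is not a DataMiner feature: the function name, the 25% tolerance, and the position counts are all hypothetical, and the counts are assumed to have been copied by hand from a results table.

```python
from statistics import median


def uneven_buckets(avg_positions: dict[int, float], tolerance: float = 0.25) -> list[int]:
    """Return the bucket numbers whose average position count differs from
    the median bucket by more than `tolerance` (a fraction of the median)."""
    mid = median(avg_positions.values())
    return [
        bucket
        for bucket, count in sorted(avg_positions.items())
        if abs(count - mid) > tolerance * mid
    ]


# Hypothetical averages, not real DataMiner output.
balanced = {1: 22.1, 2: 21.8, 3: 22.0, 4: 22.3, 5: 21.9}
skewed = {1: 35.0, 2: 21.8, 3: 22.0, 4: 22.3, 5: 9.0}

print(uneven_buckets(balanced))  # []
print(uneven_buckets(skewed))    # [1, 5]
```

A flagged bucket is only a hint to look at how the factor handles N/As; the threshold is arbitrary and should be tuned to the size of your universe.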

Further Reading

For the full list of DataMiner operations, see DataMiner Operations.

Examples are also available within the DataMiner application from the Samples menu item. These sample files come with the release package. If you want to modify the sample files, it's best to make copies, since the original files will be overwritten if you reinstall DataMiner.
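One way to make those copies is a small helper like the sketch below. It uses only the Python standard library; the function name and the paths in the usage comment are placeholders, since the location of the samples folder depends on where you installed DataMiner.

```python
import shutil
from pathlib import Path


def copy_sample(sample: Path, work_dir: Path) -> Path:
    """Copy a sample script into work_dir and return the path of the copy.

    Refuses to overwrite an existing file so earlier edits are not lost.
    """
    work_dir.mkdir(parents=True, exist_ok=True)
    target = work_dir / sample.name
    if target.exists():
        raise FileExistsError(f"{target} already exists")
    shutil.copy2(sample, target)
    return target


# Example call with placeholder paths; adjust both to your own setup.
# copy_sample(Path("path/to/samples/example.txt"), Path.home() / "dataminer-work")
```

Keeping your edited scripts in a separate working folder means a reinstall only replaces the originals.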