HDIM is a toolkit for working with high-dimensional data that emphasizes speed and statistical guarantees. Specifically, it provides tools for working with the LASSO objective function.
HDIM provides traditional iterative solvers for the LASSO objective function, including ISTA, FISTA and Coordinate Descent. HDIM also provides FOS, the Fast and Optimal Selection algorithm, a novel method for performing high-dimensional linear regression.
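To make the Coordinate Descent solver named above concrete, here is a minimal pure-Python sketch of cyclic coordinate descent for the LASSO objective, written as (1/(2n)) * ||y - X b||^2 + lam * ||b||_1. It is an illustration only, not HDIM's implementation: the names `soft_threshold`, `lasso_cd`, `lam` and `n_sweeps` are hypothetical, the scaling of the objective is an assumption, and there is no intercept, convergence check or tuning-parameter selection.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (proximal map of the absolute value)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_cd(X, y, lam, n_sweeps=200):
    """Cyclic coordinate descent for (1/(2n)) * ||y - X b||^2 + lam * ||b||_1.

    X is a list of n rows with p entries each; y is a list of n responses.
    Illustrative sketch only; this is not the HDIM solver.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X b while b is all zeros
    # Mean squared value of each column, used to scale the update
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) / n for j in range(p)]

    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column cannot explain anything
            # Correlation of column j with the residual that excludes beta[j]
            rho = sum(
                X[i][j] * (residual[i] + X[i][j] * beta[j]) for i in range(n)
            ) / n
            new_bj = soft_threshold(rho, lam) / col_sq[j]
            delta = new_bj - beta[j]
            if delta != 0.0:
                # Keep the residual in sync with the updated coefficient
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_bj
    return beta
```

Calling `lasso_cd(X, y, lam)` returns one coefficient per column of `X`; larger values of `lam` set more coefficients exactly to zero, which is the variable-selection behaviour the LASSO is used for.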
Check us out on GitHub!
What sets HDIM apart? Take a look!
As fast as the leading Elastic net regression package.
Ready to go on Linux, Mac and Windows.
Available in C++, Python, R and even JavaScript!
Carries the MIT license, and can be used in commercial applications!
C++
#include <iostream>
#include <eigen3/Eigen/Dense>
#include "x_fos.hpp"

int main(int argc, char *argv[]) {
    (void)argc;
    (void)argv;

    unsigned int N = 200; // Number of observations
    unsigned int P = 500; // Number of variables

    // Generate a random data matrix
    Eigen::Matrix< double, Eigen::Dynamic, Eigen::Dynamic > X =
        Eigen::Matrix< double, Eigen::Dynamic, Eigen::Dynamic >::Random( N, P );

    // Generate a random response vector
    Eigen::Matrix< double, Eigen::Dynamic, 1 > Y =
        Eigen::Matrix< double, Eigen::Dynamic, 1 >::Random( N, 1 );

    // Initialize the FOS functor.
    // Assumption: X_FOS is a class template over the scalar type
    // (the Python binding below is named X_FOS_d).
    hdim::X_FOS< double > fos;

    // Run the FOS algorithm using the data matrix and the response vector.
    // The third argument is the type of iterative solver that we want to
    // use, in this case Coordinate Descent ( CD )
    fos( X, Y, hdim::SolverType::cd );

    // Return the computed regression coefficients
    Eigen::Matrix< double, Eigen::Dynamic, 1 > fos_fit = fos.ReturnCoefficients();

    // Return the intercept term
    double intercept = fos.ReturnIntercept();

    std::cout << "Intercept: " << intercept << "\n Coefficients: " << fos_fit << std::endl;
}
R
library(HDIM)
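# Build a 10 x 20 matrix of exponential draws
dataset <- matrix(rexp(200, rate=.1), ncol=20)
# Use the first column as the response
yinput <- dataset[, 1, drop = FALSE]
# Use the remaining columns as the data matrix
xinput <- dataset[, -1]
# Run FOS with the Coordinate Descent ( "cd" ) solver
fos_fit <- HDIM::FOS( as.matrix(xinput), as.matrix(yinput), "cd" )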
Python
import hdim
import numpy as np

def main():
    N = 200 # Number of observations
    P = 500 # Number of variables

    # Generate synthetic data matrix
    X = np.random.rand( N, P )

    # Use first column of data matrix as the response vector
    Y = X[:,0]

    # Initialize FOS functor
    fos_test = hdim.X_FOS_d()

    # Run the FOS algorithm using the data matrix and the response vector.
    # The third function argument is the type of the iterative solver used,
    # in this case Coordinate Descent ( CD )
    fos_test( X, Y, hdim.SolverType_cd )

    fos_fit = fos_test.ReturnCoefficients() # Get the computed coefficients
    intercept = fos_test.ReturnIntercept() # And the intercept

if __name__ == "__main__":
    main()
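Once the coefficients and intercept have been retrieved as in the example above, a typical next step is to see which variables were selected and to predict on new rows. The sketch below is not part of HDIM: `selected_variables` and `predict` are hypothetical helpers, and it assumes the coefficients arrive as a flat sequence of floats with one entry per column of the data matrix.

```python
def selected_variables(coefficients, tol=0.0):
    """Indices of coefficients whose magnitude exceeds tol (the selected variables)."""
    return [j for j, b in enumerate(coefficients) if abs(b) > tol]


def predict(X_new, coefficients, intercept):
    """Linear prediction, intercept + x . beta, for each row x of X_new."""
    return [
        intercept + sum(x_j * b_j for x_j, b_j in zip(row, coefficients))
        for row in X_new
    ]
```

With the names from the example above, this would be called as `selected_variables(fos_fit)` and `predict(X_new, fos_fit, intercept)`, where `X_new` is a hypothetical list of new observations with the same columns as `X`.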
JavaScript
var tools = require( "./fos.js" );
var fos = new tools.FOS();

// Fill the data matrix, stored as a flat vector of n * p entries
var vectorized_X = new tools.VectorDouble();
var n = 20;
var p = 50;
for ( var i = 0; i < n * p ; i++ ) {
    vectorized_X.push_back( Math.random() );
}

// Fill the response vector with n entries
var Y = new tools.VectorDouble();
for ( var i = 0; i < n * 1 ; i++ ) {
    Y.push_back( Math.random() );
}

// Run FOS with the Coordinate Descent ( "cd" ) solver
fos.Run( vectorized_X, Y, "cd" );

var coefs = fos.ReturnCoefficients();
var intercept = fos.ReturnIntercept();

console.log( "Intercept: ", intercept );
console.log( "Beta Vector:" );
for ( var i = 0; i < coefs.size() ; i++ ) {
    console.log( coefs.get(i) );
}
The following combinations of operating systems and implementation languages are supported.
OS | Supported Languages |
---|---|
Linux | C++, R, Python, JavaScript |
Mac | C++, Python, JavaScript |
Windows | C++, R, JavaScript |
The build scripts are in $PKG_DIR/Python_Wrapper, where PKG_DIR is the root directory of the cloned repository.
Linux: locate nix_build.sh and mark it as executable ( chmod +x ./nix_build.sh ).
Mac: locate os_x_build.sh and mark it as executable ( chmod +x ./os_x_build.sh ).
Windows: locate win_build.ps1 and run it using PowerShell.

We provide documentation for both the native C++ code base that makes up the core of the HDIM package and the mathematical theory that our algorithms rely on.
Go to the code docs. Go to the theory docs.