XGBoost Feature Interactions Reshaped
Xgbfir is an XGBoost model dump parser that ranks features and feature interactions by several metrics.
This project started as a Python port of Xgbfi, the XGBoost Feature Interactions & Importance project. Thanks to Far0n for the great tool and idea!
Some basic descriptions from the Xgbfi project page are presented here.
You have several options to install Xgbfir.
You can install using the pip package manager by running
pip install xgbfir
Clone the repo and install:
git clone https://github.com/limexp/xgbfir.git
cd xgbfir
sudo python setup.py install
Or download the source code by pressing 'Download ZIP' on this page. Install by navigating to the proper directory and running
sudo python setup.py install
You can use Xgbfir as a Python function or as a CLI (Command Line Interface) tool.
You can produce a feature interactions file without saving a model dump file beforehand:
import xgbfir
xgbfir.saveXgbFI(booster)  # booster is an XGBoost booster
List of saveXgbFI function parameters:
Take a look at this usage example (available in examples):
from sklearn.datasets import load_iris, load_boston
import xgboost as xgb
import xgbfir

# Load the Boston housing dataset
# (load_boston was removed in scikit-learn 1.2, so this part needs an older release)
boston = load_boston()
# Train an XGBoost regressor
xgb_rmodel = xgb.XGBRegressor().fit(boston['data'], boston['target'])
# Save the feature interactions to a file with proper feature names
xgbfir.saveXgbFI(xgb_rmodel, feature_names=boston.feature_names, OutputXlsxFile='bostonFI.xlsx')

# Load the iris dataset
iris = load_iris()
# Train an XGBoost classifier
xgb_cmodel = xgb.XGBClassifier().fit(iris['data'], iris['target'])
# Save the feature interactions to a file with proper feature names
xgbfir.saveXgbFI(xgb_cmodel, feature_names=iris.feature_names, OutputXlsxFile='irisFI.xlsx')
Xgbfir can be run as a console tool with the following command:
Use the following command for help:
An XGBoost model dump must be created before running xgbfir.
To dump a model with proper feature names, use the following code:
booster.feature_names = list(feature_names)  # set names for the XGBoost booster
booster.dump_model('xgb.dump', with_stats=True)
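To give a rough idea of what such a dump contains and how a parser can rank features from it, here is a minimal sketch. It is not Xgbfir's implementation: the function name rank_features, the regular expression, and the sample dump text are all hypothetical and only follow the general shape of a text dump written with with_stats=True. It covers single features only (no interactions) and two simple metrics, total gain and number of splits.

```python
import re
from collections import defaultdict

# Hypothetical sample text in the style of dump_model(..., with_stats=True).
# The feature names and numbers are made up for illustration.
SAMPLE_DUMP = """booster[0]:
0:[petal_length<2.45] yes=1,no=2,missing=1,gain=60.0,cover=50
    1:leaf=0.4,cover=16
    2:[petal_width<1.75] yes=3,no=4,missing=3,gain=25.0,cover=34
        3:leaf=-0.2,cover=18
        4:leaf=-0.1,cover=16
booster[1]:
0:[petal_length<4.95] yes=1,no=2,missing=1,gain=10.0,cover=40
    1:leaf=0.1,cover=20
    2:leaf=-0.3,cover=20
"""

# Matches split nodes only: "<id>:[<feature><<threshold>] ... gain=<value>".
# Leaf lines and "booster[n]:" headers do not match and are skipped.
SPLIT_RE = re.compile(
    r"^\s*\d+:\[(?P<feature>[^<\]]+)<[^\]]*\].*?gain=(?P<gain>[-+0-9.eE]+)"
)


def rank_features(dump_text):
    """Return (feature, stats) pairs sorted by total gain, highest first."""
    stats = defaultdict(lambda: {"gain": 0.0, "splits": 0})
    for line in dump_text.splitlines():
        match = SPLIT_RE.match(line)
        if match is None:
            continue
        entry = stats[match.group("feature")]
        entry["gain"] += float(match.group("gain"))  # sum of split gains
        entry["splits"] += 1                         # number of splits on the feature
    return sorted(stats.items(), key=lambda kv: kv[1]["gain"], reverse=True)


ranking = rank_features(SAMPLE_DUMP)
for feature, s in ranking:
    print(f"{feature}: gain={s['gain']:.1f}, splits={s['splits']}")
```

To try it on a real dump, read the file written by dump_model (for example open('xgb.dump').read()) and pass the text to rank_features. Xgbfir itself reports more metrics and also ranks interactions between features, which this sketch does not attempt.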
Please see CONTRIBUTING.md.
Feel free to open issues or pull requests.