The python-weka-wrapper3 package makes it easy to run
Weka <http://www.cs.waikato.ac.nz/~ml/weka/> algorithms and filters from
within Python 3. It offers access to the Weka API through thin wrappers around JNI
calls, using the python-javabridge <https://pypi.python.org/pypi/python-javabridge>
package.
Forum for project at:
https://groups.google.com/forum/#!forum/python-weka-wrapper

- install_packages (module: weka.core.packages) no longer interprets the installation message returned when installing from a URL or local zip file as a failure to install; it also outputs any installation message in the console, and now has flags for fail_fast mode (the first package that fails stops the installation process) and for whether to return details (a dict per package rather than just a bool for all packages)
- install_package (module: weka.core.packages) can return either a boolean flag of success or detailed information
- install_packages and install_missing_packages of module weka.core.packages now allow a list of package names instead of tuples (name, version), assuming latest as the version
- get_jclass in module weka.core.classes can handle primitive classes now as well (e.g., int -> java.lang.Integer.TYPE)
- get_non_public_field and call_non_public_method in module weka.core.classes allow accessing private/protected fields and calling private/protected methods of Java objects, which avoids having to sub-class classes to get public access to them (NB: only works as long as the security manager allows that)
- added split_commandline method to module weka.core.classes, which splits a command-line into a tuple of classname and option list
- Instances class (module: weka.core.dataset) now supports slicing
- added plot_xmlbif_graph and xmlbif_to_dot to module weka.plot.graph for plotting XML BIF graphs generated by BayesNet using GraphViz
- added plot_graph to module weka.plot.graph to plot dot or XML BIF graphs
- added logging_level parameter to the start method of the weka.core.jvm module, enabling the user to turn off debugging output in an easy way (https://github.com/fracpete/python-weka-wrapper3/issues/40)
- added cv_splits to class Instances from module weka.core.dataset to return a list of train/test tuples as used by cross-validation
- Tester class (module: weka.experiments) now has an option to swap columns/rows for comparing datasets rather than classifiers
- SimpleExperiment class and derived classes (module: weka.experiments) now have the additional parameters in the constructor: class_for_ir_statistics, attribute_id, pred_target_column
- is_installed (module: weka.core.packages) can now check whether a specific version is installed
- added pww-packages entry point for managing Weka packages from the command-line (actions: list/info/install/uninstall/suggest/is-installed)
- JavaObject.new_instance in module weka.core.classes now automatically installs packages based on suggestions if the JVM was started with the auto_install flag enabled
- test_model_once of class Evaluation (module: weka.classifiers) now has the additional parameter store, which allows the recording of the predictions (necessary for statistics like AUC)
- create_instances_from_lists and create_instances_from_matrices (module weka.core.dataset) now allow the specification of column names for input and output variables
- DistanceFunction class (module weka.core.distances) (thanks to Martin Trat, https://github.com/fracpete/python-weka-wrapper3/pull/39)
- JavaArray class (module: weka.core.classes) now has __str__ and __repr__ methods that output classname and size
- python-javabridge, the new name (fork?) of the javabridge library
- Package.__str__ (weka.core.packages module) method now returns a string rather than printing the name/version
- added to_numpy(...) methods to Instance and Instances classes (module weka.core.dataset) to make it easy to obtain a numpy array from the Weka dataset
- added help_for to weka.core.classes module to generate a help screen for a weka.core.OptionHandler class using just the classname
- to_help method of the weka.core.classes.OptionHandler class now allows tweaking the generated output a bit better (e.g., what sections to output)
- plot_classifier_errors (module weka.plot.classifiers) now plots the diagonal after adding all the plot data to get the right limits
- added weka.core.distances module for distance functions, with DistanceFunction base class
- added avg_silhouette_coefficient method to weka.clusterers to calculate the average silhouette coefficient
- Package class of the weka.core.packages module now has a version property to quickly access the version, which is stored in the meta-data; the metadata property now returns a proper Python dictionary
- weka.core.packages module: install_packages to install more than one package, install_missing_package and install_missing_packages to install one or more packages if missing (can automatically stop the JVM and exit the process), uninstall_packages to remove more than one package in one operation
- ASEvaluation class in the weka.attribute_selection module now offers the following methods for attribute transformers like PCA: transformed_header, transformed_data, convert_instance
- weka.core.classes.JavaObject are now serializable via pickle
- added copy_structure to the weka.core.dataset.Instances class to quickly get the header of a dataset
- added header to the following classes, which returns the training data structure: ASEvaluation, ASSearch, Associator, Classifier, Clusterer, TSForecaster
- weka.core.serialization have been moved into weka.core.classes, with the following methods getting the serialization_ prefix: write, write_all, read, read_all
- classes.new_instance method can take an options list now as well
- added classes.get_enum method to return the instance of a Java enum item
- added classes.new_instance method to create a new instance of a Java class
- added typeconv.jstring_list_to_string_list method to convert a java.util.List containing strings into a Python list
- added typeconv.jdouble_to_float method to convert a java.lang.Double to a Python float
- typeconv renamed methods: string_array_to_list to jstring_array_to_list, string_list_to_array to string_list_to_jarray, double_matrix_to_ndarray to jdouble_matrix_to_ndarray, enumeration_to_list to jenumeration_to_list, double_to_float to float_to_jfloat
- added weka.timeseries module that wraps the timeseriesForecasting Weka package
- added weka.core.systeminfo module for obtaining output from weka.core.SystemInfo
- added system_info parameter to weka.core.jvm.start() method
- added AttributeSelectedClassifier meta-classifier to module weka.classifiers
- added AttributeSelection meta-filter to module weka.filters
- added class_index parameter to weka.core.converters.load_any_file and weka.core.converters.Loader.load_file, which allows specifying the class index while loading the data (first, second, third, last-2, last-1, last or 1-based index)
- added append and clear methods to weka.filters.MultiFilter and weka.classifiers.MultipleClassifiersCombiner to make adding filters/classifiers easier
- added attribute_names() method to weka.core.dataset.Instances class
- added subset method to weka.core.dataset.Instances class, which returns a subset of columns and/or rows
- added list_property_names to weka.core.classes module to allow listing of Bean property names (which are used by GridSearch and MultiSearch) for a Java object
- added suggest_package to the weka.core.packages module for suggesting packages for partial class names/package names (NNge or .ft.) or exact class names (weka.classifiers.meta.StackingC)
- JavaObject.new_instance method now suggests packages (if possible) in case the instantiation fails because a package is not installed or the JVM was not started with package support
- train_test_split of the weka.core.dataset.Instances class now creates a copy of itself before applying randomization, to avoid changing the order of data for subsequent calls
- create_instances_from_matrices from module weka.core.dataset now works with pure numeric data again
- pww-associator, pww-attsel, pww-classifier, pww-clusterer, pww-datagenerator, pww-filter
- added serialize, deserialize methods to weka.classifiers.Classifier to simplify loading/saving a model
- added serialize, deserialize methods to weka.clusterers.Clusterer to simplify loading/saving a model
- added serialize, deserialize methods to weka.filters.Filter to simplify loading/saving a filter
- added plot_rocs and plot_prcs to weka.plot.classifiers module to plot ROC/PRC curves on the same dataset for multiple classifiers
- plot_classifier_errors of weka.plot.classifiers module now allows plotting predictions of multiple classifiers by providing a dictionary
- create_instances_from_matrices from module weka.core.dataset now allows string and bytes as well
- create_instances_from_lists from module weka.core.dataset now allows string and bytes as well
- AssociationRuleProducer (package weka.associations): AssociationRules, AssociationRule, item
- added to_source method to weka.classifiers.Classifier and weka.filters.Filter (underlying Java classes must implement the respective Sourcable interface)
- weka.core.jvm to avoid global setting global logging setup to DEBUG (thanks to https://github.com/Arnie97)
- weka.jar now included in PyPI package
- weka.classifiers.Evaluation: cumulative_margin_distribution, sf_prior_entropy, sf_scheme_entropy
- added weka.core.ClassHelper Java class for obtaining classes and static fields, as javabridge only uses the system class loader
- added check_for_modified_class_attribute method to FilterClassifier class
- added complete_classname method to weka.core.classes module, which allows completion of partial classnames like .J48 to weka.classifiers.trees.J48 if there is a unique match; JavaObject.new_instance and JavaObject.check_type now make use of this functionality, allowing for instantiations like Classifier(cls=".J48")
- jvm.start(system_cp=True) no longer fails with a KeyError: 'CLASSPATH' if there is no CLASSPATH environment variable defined
- mtl.jar, core.jar and arpack_combined_all.jar were added as is to the weka.jar in the 3.9.1 release instead of adding their content to it. Repackaged weka.jar to fix this issue (https://github.com/fracpete/python-weka-wrapper3/issues/5)
- typeconv.double_matrix_to_ndarray no longer assumes a square matrix (https://github.com/fracpete/python-weka-wrapper3/issues/4)
- len(Instances) now returns the number of rows in the dataset (module weka.core.dataset)
- added insert_attribute to the Instances class
- added create_relational to the Attribute class
- plot_learning_curve method of module weka.plot.classifiers now accepts a list of test sets; * is the index of the test set in the label template string
- added missing_value() methods to weka.core.dataset module and Instance class
- y for convenience method create_instances_from_lists in module weka.core.dataset is now optional
- added create_instances_from_matrices to weka.core.dataset module to easily create an Instances object from numpy matrices (x and y)

Version | Tag | Published |
---|---|---|
0.2.12 | | 4mos ago |
0.2.11 | | 6mos ago |
0.2.10 | | 9mos ago |
0.2.9 | | 1yr ago |
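
The split_commandline entry in the changelog above describes splitting a command-line into a classname and an option list. As a rough illustration of that idea only, the sketch below uses Python's standard shlex module. The function name split_commandline_sketch is hypothetical, and the wrapper's own method may apply different quoting rules, so this is an approximation under those assumptions rather than the library's implementation.

```python
import shlex


def split_commandline_sketch(cmdline):
    """Split 'classname opt1 opt2 ...' into (classname, [options]).

    Hypothetical helper for illustration; not part of python-weka-wrapper3.
    """
    parts = shlex.split(cmdline)  # honors shell-style quoting
    if not parts:
        raise ValueError("empty command-line")
    return parts[0], parts[1:]


classname, options = split_commandline_sketch(
    "weka.classifiers.trees.J48 -C 0.25 -M 2")
print(classname)  # weka.classifiers.trees.J48
print(options)    # ['-C', '0.25', '-M', '2']
```

Returning a tuple keeps the classname and its options separate, which is the shape the changelog entry describes.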
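
The class_index entry in the changelog above lists the accepted values: first, second, third, last-2, last-1, last, or a 1-based index. The sketch below shows one way such a value could be mapped to a 0-based attribute position. The function name resolve_class_index_sketch is hypothetical, and reading last-1 and last-2 as the second-to-last and third-to-last attribute is an assumption; the library may resolve these values differently.

```python
def resolve_class_index_sketch(spec, num_attributes):
    """Translate a class-index value into a 0-based attribute index.

    Hypothetical helper for illustration; not part of python-weka-wrapper3.
    """
    named = {
        "first": 0,
        "second": 1,
        "third": 2,
        "last-2": num_attributes - 3,  # assumed: third from the end
        "last-1": num_attributes - 2,  # assumed: second from the end
        "last": num_attributes - 1,
    }
    # Anything that is not a keyword is treated as a 1-based number.
    index = named[spec] if spec in named else int(spec) - 1
    if not 0 <= index < num_attributes:
        raise ValueError(f"class index {spec!r} is out of range")
    return index


for spec in ("first", "last-1", "last", "3"):
    print(spec, "->", resolve_class_index_sketch(spec, 5))
```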
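
The complete_classname entry in the changelog above says that a partial classname such as .J48 is completed to weka.classifiers.trees.J48 if there is a unique match. The sketch below illustrates that unique-match rule with a plain suffix comparison against a hard-coded list of class names. Both the function name complete_classname_sketch and the KNOWN_CLASSNAMES list are assumptions made for the example; how the library actually discovers and matches class names is not shown in this document.

```python
def complete_classname_sketch(partial, known_classnames):
    """Return the only known class name ending in `partial`, else raise.

    Hypothetical helper for illustration; not part of python-weka-wrapper3.
    """
    matches = [name for name in known_classnames if name.endswith(partial)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"no class name matches {partial!r}")
    raise ValueError(f"{partial!r} is ambiguous: {sorted(matches)}")


# Assumed stand-in for the class names available on the classpath.
KNOWN_CLASSNAMES = [
    "weka.classifiers.trees.J48",
    "weka.classifiers.trees.RandomForest",
    "weka.classifiers.rules.ZeroR",
]

print(complete_classname_sketch(".J48", KNOWN_CLASSNAMES))
# weka.classifiers.trees.J48
```

Raising on zero or multiple matches mirrors the "unique match" condition: an ambiguous shorthand is rejected instead of being resolved to an arbitrary class.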