Evaluation & Experiments

class pynlpl.evaluation.AbstractExperiment(inputdata=None, **parameters)
defaultparameters()
delete()
done(warn=True)

Is the subprocess done?

duration()
run()
sample(size)

Return a sample of the input data

score()
start()

Start as a detached subprocess, immediately returning execution to caller.

startcommand(command, cwd, stdout, stderr, *arguments, **parameters)
wait()
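
The start/done/wait lifecycle described above can be illustrated with the standard library alone. This is a minimal sketch, not pynlpl's implementation: the class name DetachedCommand and its attributes are hypothetical, and it only assumes that an experiment wraps one child process.

```python
import subprocess
import sys
import time


class DetachedCommand:
    """Hypothetical helper (not part of pynlpl): wraps one child process."""

    def __init__(self, args):
        self.args = args
        self.process = None
        self.begin = None
        self.end = None

    def start(self):
        # Popen returns as soon as the child is spawned; it does not block.
        self.begin = time.time()
        self.process = subprocess.Popen(self.args)

    def done(self):
        # poll() returns None while the child is still running.
        if self.process is None:
            return False
        finished = self.process.poll() is not None
        if finished and self.end is None:
            self.end = time.time()
        return finished

    def wait(self):
        # Block until the child exits, then report its exit code.
        returncode = self.process.wait()
        if self.end is None:
            self.end = time.time()
        return returncode

    def duration(self):
        # Elapsed seconds; falls back to "now" while the child is running.
        if self.begin is None:
            return 0.0
        end = self.end if self.end is not None else time.time()
        return end - self.begin


# Stand-in workload: a short-lived Python child process.
job = DetachedCommand([sys.executable, "-c", "import time; time.sleep(0.1)"])
job.start()             # returns immediately
exit_code = job.wait()  # blocks until the child has exited
```

Calling done() in a loop instead of wait() lets the caller do other work while the child runs.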
class pynlpl.evaluation.ClassEvaluation(goals=[], observations=[], missing={}, encoding='utf-8')
accuracy(cls=None)
append(goal, observation)
auc(cls=None, macro=False)
compute()
confusionmatrix(casesensitive=True)
fp_rate(cls=None, macro=False)
fscore(cls=None, beta=1, macro=False)
outputmetrics()
precision(cls=None, macro=False)
recall(cls=None, macro=False)
specificity(cls=None, macro=False)
tp_rate(cls=None, macro=False)
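
As an illustration of what the per-class metrics listed above usually mean, the sketch below computes precision, recall and F-score for one class from paired goals and observations. It uses the textbook definitions; whether ClassEvaluation computes them in exactly this way is an assumption, and all function names here are hypothetical.

```python
def binary_counts(goals, observations, cls):
    """Count true/false positives and negatives for one class."""
    tp = fp = fn = tn = 0
    for goal, observation in zip(goals, observations):
        if observation == cls and goal == cls:
            tp += 1
        elif observation == cls:
            fp += 1
        elif goal == cls:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def precision_of(tp, fp):
    # Fraction of predicted positives that are correct.
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall_of(tp, fn):
    # Fraction of actual positives that were found.
    return tp / (tp + fn) if (tp + fn) else 0.0


def fscore_of(precision, recall, beta=1):
    # Weighted harmonic mean; beta > 1 favours recall.
    if precision == 0 and recall == 0:
        return 0.0
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


goals = ["A", "A", "B", "B", "A"]
observations = ["A", "B", "B", "B", "A"]

tp, fp, fn, tn = binary_counts(goals, observations, "A")
p = precision_of(tp, fp)
r = recall_of(tp, fn)
f1 = fscore_of(p, r)
```

For class "A" in this toy data there are two true positives, no false positives and one false negative.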
class pynlpl.evaluation.ConfusionMatrix(tokens=None, casesensitive=True, dovalidation=True)

Confusion Matrix

class pynlpl.evaluation.ExperimentPool(size)
append(experiment)
poll(haltonerror=True)
run(haltonerror=True)
start(experiment)
class pynlpl.evaluation.OrdinalEvaluation(goals=[], observations=[], missing={}, encoding='utf-8')
compute()
mae(cls=None)
rmse(cls=None)
class pynlpl.evaluation.ParamSearch(experimentclass, inputdata, parameterscope, poolsize=1, constraintfunc=None, delete=True)

A simpler version of WPSParamSearch, without Wrapped Progressive Sampling

exception pynlpl.evaluation.ProcessFailed
class pynlpl.evaluation.WPSParamSearch(experimentclass, inputdata, size, parameterscope, poolsize=1, sizefunc=None, prunefunc=None, constraintfunc=None, delete=True)

ParamSearch with support for Wrapped Progressive Sampling

searchbest()
test(i=None)
pynlpl.evaluation.auc(x, y, reorder=False)

Compute Area Under the Curve (AUC) using the trapezoidal rule

This is a general function that operates on points on a curve. For computing the area under the ROC curve, see auc_score().

Parameters:
  • x (array, shape = [n]) – x coordinates.
  • y (array, shape = [n]) – y coordinates.
  • reorder (boolean, optional (default=False)) – If True, assume that the curve is ascending in the case of ties, as for an ROC curve. If the curve is non-ascending, the result will be wrong.
Returns:
  auc
Return type:
  float

Examples

>>> import numpy as np
>>> from sklearn import metrics
>>> y = np.array([1, 1, 2, 2])
>>> pred = np.array([0.1, 0.4, 0.35, 0.8])
>>> fpr, tpr, thresholds = metrics.roc_curve(y, pred, pos_label=2)
>>> metrics.auc(fpr, tpr)
0.75
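
The trapezoidal rule named in the description can also be written out by hand. The following is a minimal sketch in plain Python; the function name trapezoid_area is hypothetical, and it assumes the points are already ordered by x.

```python
def trapezoid_area(x, y):
    """Area under the piecewise-linear curve through the points (x[i], y[i])."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if len(x) < 2:
        raise ValueError("at least two points are needed")
    area = 0.0
    for i in range(1, len(x)):
        width = x[i] - x[i - 1]
        mean_height = (y[i] + y[i - 1]) / 2.0
        area += width * mean_height
    return area


# ROC points for the labels and scores used in the example above,
# worked out by hand: (false positive rate, true positive rate).
fpr = [0.0, 0.0, 0.5, 0.5, 1.0]
tpr = [0.0, 0.5, 0.5, 1.0, 1.0]
area = trapezoid_area(fpr, tpr)
```

Vertical segments of the curve have zero width and contribute nothing, so only the two horizontal segments add area here (0.25 and 0.5).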

See also

auc_score()
Computes the area under the ROC curve
pynlpl.evaluation.filesampler(files, testsetsize=0.1, devsetsize=0, trainsetsize=0, outputdir='', encoding='utf-8')

Extract a training set, a test set and optionally a development set from one file, or from multiple interdependent files (such as a parallel corpus). Each line is assumed to contain one instance (such as a word or a sentence).
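
The key point for interdependent files is that the same line numbers must end up in the same set for every file. A minimal sketch of that idea follows; it works on in-memory lists rather than files, the helper names are hypothetical, and treating the set sizes as fractions of the total is an assumption that the description above does not confirm.

```python
import random


def split_indices(n, testsetsize=0.1, devsetsize=0.0, seed=0):
    """Hypothetical helper: partition range(n) into train, test and dev indices."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    n_test = int(round(n * testsetsize))
    n_dev = int(round(n * devsetsize))
    test = sorted(indices[:n_test])
    dev = sorted(indices[n_test:n_test + n_dev])
    train = sorted(indices[n_test + n_dev:])
    return train, test, dev


def select(lines, indices):
    # Pick the given line numbers, preserving their order.
    return [lines[i] for i in indices]


# Two "parallel" files, represented as lists of lines.
source = ["src %d" % i for i in range(20)]
target = ["tgt %d" % i for i in range(20)]

# Sample line numbers once, then apply them to every file.
train, test, dev = split_indices(len(source), testsetsize=0.1, devsetsize=0.1)
train_pairs = list(zip(select(source, train), select(target, train)))
test_pairs = list(zip(select(source, test), select(target, test)))
```

Because the indices are drawn once and reused, line i of the source always stays paired with line i of the target.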

pynlpl.evaluation.mae(absolute_error_values)
pynlpl.evaluation.rmse(squared_error_values)
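
Judging by the parameter names, mae() and rmse() take precomputed per-instance errors rather than raw goals and observations. The sketch below follows that reading, which is an assumption; the function names mae_sketch and rmse_sketch are hypothetical stand-ins, not the library functions.

```python
import math


def mean(values):
    values = list(values)
    if not values:
        raise ValueError("at least one value is needed")
    return sum(values) / len(values)


def mae_sketch(absolute_error_values):
    # Mean absolute error: the average of |goal - observation|.
    return mean(absolute_error_values)


def rmse_sketch(squared_error_values):
    # Root mean squared error: the square root of the average squared error.
    return math.sqrt(mean(squared_error_values))


goals = [1, 2, 3, 4]
observations = [1, 3, 3, 2]

absolute_errors = [abs(g - o) for g, o in zip(goals, observations)]
squared_errors = [(g - o) ** 2 for g, o in zip(goals, observations)]

mae_value = mae_sketch(absolute_errors)
rmse_value = rmse_sketch(squared_errors)
```

RMSE penalises the single large error (4 versus 2) more heavily than MAE does, which is why it comes out higher on this data.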