act.qc.QCFilter.add_gesd_test

QCFilter.add_gesd_test(var_name, outliers=5, alpha=0.05, test_meaning=None, test_assessment='Indeterminate', test_number=None, flag_value=False, prepend_text=None)

Method to perform the generalized Extreme Studentized Deviate (GESD) test to detect one or more outliers in a univariate data set that follows an approximately normal distribution. The default is to test for up to 5 outliers. The number of outliers can be overestimated, because only values the test determines to be outliers are flagged. When set to test for a single outlier, this is the Grubbs test.
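For reference, the textbook formulation of the generalized ESD test is sketched below. This is the standard statement of the test, not a confirmed description of what the underlying library computes. Here n is the number of (non-NaN) values, r is the upper bound on the number of outliers (the outliers argument), and α is the significance level (the alpha argument); these symbols are introduced only for illustration.

```latex
% Test statistic at step i, computed on the values remaining after the
% most extreme value from each previous step has been removed:
R_i = \frac{\max_j \lvert x_j - \bar{x} \rvert}{s}, \qquad i = 1, \dots, r

% Critical value at step i, where t_{p,\nu} is the 100p percentage point
% of the t distribution with \nu degrees of freedom:
\lambda_i = \frac{(n - i)\, t_{p,\, n-i-1}}
                 {\sqrt{\left(n - i - 1 + t_{p,\, n-i-1}^{2}\right)(n - i + 1)}},
\qquad p = 1 - \frac{\alpha}{2\,(n - i + 1)}
```

The number of outliers is taken as the largest i for which R_i > λ_i, which is why r only needs to be an upper bound. With r = 1 the procedure reduces to the Grubbs test.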

The library used to perform the test does not accept NaN values. NaN values are filtered out before testing, and the detected outliers are matched back to their original positions afterward. This can make the test run slower on large data sets.
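The filter-then-match step can be illustrated with a minimal sketch. It is not the library's implementation: flag_outliers_ignoring_nan and farthest_from_median are hypothetical names, and the toy detector is a stand-in for the real GESD routine, used only to show how positions in the NaN-free data map back to the original array.

```python
import math


def flag_outliers_ignoring_nan(values, detect):
    """Return a boolean mask over `values`, True where `detect` found an outlier.

    `detect` is a hypothetical stand-in for the outlier routine: it takes a
    NaN-free list and returns positions (within that list) of the outliers.
    """
    # Remember the original position of every non-NaN value.
    kept_positions = [i for i, v in enumerate(values) if not math.isnan(v)]
    clean = [values[i] for i in kept_positions]

    # Map each detected position back to its index in the original data.
    mask = [False] * len(values)
    for j in detect(clean):
        mask[kept_positions[j]] = True
    return mask


def farthest_from_median(clean):
    """Toy detector for illustration only (this is not the GESD test)."""
    median = sorted(clean)[len(clean) // 2]
    return [max(range(len(clean)), key=lambda j: abs(clean[j] - median))]


data = [10.1, float("nan"), 9.9, 10.0, 55.0, float("nan"), 10.2]
mask = flag_outliers_ignoring_nan(data, farthest_from_median)
print(mask)
```

The value 55.0 sits at index 3 of the filtered data but index 4 of the original data, and the mask marks index 4.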

Parameters:
  • var_name (str) – Data variable name.

  • outliers (int or float) – Number of outliers to test for. If set to 1, this is the Grubbs test. If set to a float value less than one, the number of outliers to test for is calculated from the data: a value from 0 to 0.9 is multiplied by the number of data values to determine the number of outliers to check. A float value larger than 0.9 is treated as 0.9.

  • alpha (float) – Significance level for the hypothesis test.

  • test_meaning (str) – Optional text description to add to flag_meanings describing the test. Will use a default if not set.

  • test_assessment (str) – Optional single word describing the assessment of the test. Will use a default if not set.

  • test_number (int) – Optional test number to use. If not set will use next available test number.

  • flag_value (boolean) – Indicates that the tests are stored as integers, not bit-packed values, in the quality control variable.

  • prepend_text (str) – Optional text to prepend to the test meaning. An example is to indicate which institution added the test.
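The rule described for the outliers parameter can be sketched as follows. This is an illustrative reading of the description, not the library's code: resolve_outlier_count is a hypothetical helper, and truncating the product to an integer is an assumption, since the rounding is not stated.

```python
def resolve_outlier_count(outliers, n_values):
    """Hypothetical helper: turn the `outliers` argument into a count.

    Values below one are treated as a fraction of the data, capped at 0.9.
    Values of one or more are used directly as the count.
    """
    if outliers < 1:
        fraction = min(outliers, 0.9)
        # Assumption: the product is truncated to an integer.
        return int(fraction * n_values)
    return int(outliers)


print(resolve_outlier_count(5, 200))     # 5
print(resolve_outlier_count(0.25, 200))  # 50
print(resolve_outlier_count(0.95, 200))  # 180, because 0.95 is capped at 0.9
```

Under this reading, a fractional value scales the search with the size of the data set, which is convenient when the same call is applied to time series of different lengths.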

Returns:

test_info (tuple) – A tuple containing test information, including var_name, QC variable name, test_number, test_meaning, and test_assessment.