The `list_available_variables()` capability exposes tags along with variable names and types, so you can query the available outputs for specific tag matches.

If the columns you pass to the `@extract_columns` decorator are held in a list, call the decorator with an asterisk so that the list is unpacked into separate arguments.
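As a minimal sketch of the asterisk call, the snippet below uses a simplified stand-in for `extract_columns` so that it runs without Hamilton installed. The stand-in only records the column names it receives; it is not Hamilton's implementation. `COLUMNS` and `my_features` are hypothetical names.

```python
from typing import Callable, List


def extract_columns(*columns: str) -> Callable:
    """Simplified stand-in (NOT Hamilton's implementation): records column names."""
    def decorator(fn: Callable) -> Callable:
        fn.extracted_columns = list(columns)
        return fn
    return decorator


# Hypothetical list of columns to extract.
COLUMNS: List[str] = ["spend", "signups"]


# The asterisk unpacks the list, so this is equivalent to
# @extract_columns("spend", "signups").
@extract_columns(*COLUMNS)
def my_features() -> dict:
    # Stand-in for a function that would return a dataframe with these columns.
    return {"spend": [10, 20], "signups": [1, 2]}


print(my_features.extracted_columns)  # ['spend', 'signups']
```

Without the asterisk, the decorator would receive one list argument instead of one argument per column.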
The `@extract_fields` decorator takes a mapping of `field_name` to `field_type`. This information is used for static compilation to ensure downstream uses are expecting the right type.

`@config.when` allows you to specify different implementations depending on configuration parameters.
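To make the idea of configuration-dependent implementations concrete, here is a toy sketch in plain Python. The `when` helper, the `resolve` function, and the shipping-cost functions are all hypothetical stand-ins written for this illustration; they are not Hamilton's API or implementation.

```python
from typing import Any, Callable, Dict, List


def when(**key_values: Any) -> Callable:
    """Toy stand-in for the idea behind @config.when (NOT Hamilton's implementation)."""
    def decorator(fn: Callable) -> Callable:
        # Remember which configuration values this implementation applies to.
        fn.applies_when = key_values
        return fn
    return decorator


@when(region="US")
def shipping_cost_us(weight_kg: float) -> float:
    return 5.0 + 1.0 * weight_kg


@when(region="UK")
def shipping_cost_uk(weight_kg: float) -> float:
    return 4.0 + 1.5 * weight_kg


def resolve(candidates: List[Callable], config: Dict[str, Any]) -> Callable:
    """Return the first implementation whose conditions all match the config."""
    for fn in candidates:
        if all(config.get(key) == value for key, value in fn.applies_when.items()):
            return fn
    raise KeyError(f"No implementation matches config {config!r}")


shipping_cost = resolve([shipping_cost_us, shipping_cost_uk], {"region": "UK"})
print(shipping_cost(2.0))  # 7.0
```

The point of the sketch is only that the configuration, not the calling code, decides which implementation backs a given output.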
There is currently no `@config.otherwise(...)` decorator, so make sure your `config.when` decorators specify the full set of configuration possibilities. Any missing cases will not have that output column (and subsequent downstream nodes may error out if they ask for it). To make this easier, we have a few more `@config` decorators:

- `@config.when_not(param=value)`: Will be included if the parameter is not equal to the value specified.
- `@config.when_in(param=[value1, value2, ...])`: Will be included if the parameter is equal to one of the specified values.
- `@config.when_not_in(param=[value1, value2, ...])`: Will be included if the parameter is not equal to any of the specified values.
- `@config`: If you're feeling adventurous, you can pass in a lambda function that takes in the entire configuration and resolves to `True` or `False`. You probably don't want to do this.

`parameterized` allows you to keep your code DRY by reusing the same function to create multiple distinct outputs. The `parameter` keyword argument has to match one of the arguments in the function. The rest of the arguments are pulled from outside the DAG. The `assigned_output` keyword argument takes a dictionary of `tuple(Output Name, Documentation string) -> value`.

`parameterized_inputs` allows you to keep your code DRY by reusing the same function to create multiple distinct outputs. The keyword arguments passed must have the following structure: `OUTPUT_NAME = Mapping of function argument to input that should go into it`.
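The keyword-argument structure is easier to see in code. The sketch below is a toy stand-in for `parameterized_inputs` written in plain Python so that it runs without Hamilton or pandas; it is not Hamilton's implementation. The function name `date_shifter`, its docstring, and the use of plain lists instead of series are assumptions made for illustration.

```python
from typing import Callable, Dict, List


def parameterized_inputs(**outputs: Dict[str, str]) -> Callable:
    """Toy stand-in (NOT Hamilton's implementation).

    Each keyword is OUTPUT_NAME = mapping of function argument -> input name.
    """
    def decorator(fn: Callable) -> Dict[str, Callable]:
        expanded = {}
        for output_name, arg_to_input in outputs.items():
            def node(inputs: dict, _mapping=arg_to_input):
                # Look up each function argument under its mapped input name.
                return fn(**{arg: inputs[src] for arg, src in _mapping.items()})
            node.__name__ = output_name
            # Fill the docstring template with the input names and output_name.
            node.__doc__ = fn.__doc__.format(output_name=output_name, **arg_to_input)
            expanded[output_name] = node
        return expanded
    return decorator


@parameterized_inputs(
    D_ELECTION_2016_shifted=dict(one_off_date="D_ELECTION_2016"),
    SOME_OUTPUT_NAME=dict(one_off_date="SOME_INPUT_NAME"),
)
def date_shifter(one_off_date: List[int]) -> List[int]:
    """{one_off_date} shifted by 1 to create {output_name}"""
    return [0] + one_off_date[:-1]


inputs = {"D_ELECTION_2016": [0, 1, 0], "SOME_INPUT_NAME": [1, 0, 0]}
node = date_shifter["D_ELECTION_2016_shifted"]
print(node.__doc__)  # D_ELECTION_2016 shifted by 1 to create D_ELECTION_2016_shifted
print(node(inputs))  # [0, 0, 1]
```

One decorated function yields two named outputs here, each reading from a different input and each carrying its own filled-in documentation string.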
`D_ELECTION_2016_shifted` is an output that corresponds to replacing `one_off_date` with `D_ELECTION_2016`. Similarly, `SOME_OUTPUT_NAME` is an output that corresponds to replacing `one_off_date` with `SOME_INPUT_NAME`. The documentation for both uses the same function docstring, with templated values replaced by the input parameter names and by the reserved value `output_name`.

`@does` is a decorator that essentially allows you to run a function over all the input parameters. So you can't pass just any function to `@does`; instead, the function passed has to take any number of inputs and process them all in the same way.
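Here is a rough sketch of what "process all inputs in the same way" can look like. The `does` below is a toy stand-in written in plain Python, not Hamilton's implementation; it assumes the replacing function accepts only `**kwargs`. The names `total_weight`, `weight_day_1`, and `weight_day_2` are hypothetical, and plain floats are used in place of series.

```python
import functools
import inspect
from typing import Callable


def does(replacing_function: Callable) -> Callable:
    """Toy stand-in for @does (NOT Hamilton's implementation)."""
    params = list(inspect.signature(replacing_function).parameters.values())
    # Assumption of this sketch: the replacing function takes only **kwargs.
    if len(params) != 1 or params[0].kind is not inspect.Parameter.VAR_KEYWORD:
        raise ValueError("The function passed to does must accept only **kwargs")

    def decorator(fn: Callable) -> Callable:
        @functools.wraps(fn)
        def wrapper(**kwargs):
            # The decorated function's body is ignored; every input is
            # handed to the replacing function instead.
            return replacing_function(**kwargs)
        return wrapper
    return decorator


def sum_series(**series: float) -> float:
    """Treats every input the same way: adds them all together."""
    return sum(series.values())


@does(sum_series)
def total_weight(weight_day_1: float, weight_day_2: float) -> float:
    """Adds weight_day_1 and weight_day_2."""
    pass


print(total_weight(weight_day_1=1.5, weight_day_2=2.5))  # 4.0
```

The decorated function only declares the inputs and the documentation; the shared logic lives in `sum_series`.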
For example, you can annotate a function with the `@does` decorator and pass it the `sum_series` function. The `@does` decorator is currently limited to functions that take exactly one argument, a generic `**kwargs`.

`@model` allows you to abstract a function that is a model. You will need to implement models that make sense for your business case. Reach out if you need examples.
Note that `GLM`, as used with `@model`, is not part of the Hamilton framework; it is a user-defined model.

The `output_column` parameter is specifically for cases where the name of the function differs from the output column that it should represent, e.g. if you use the model result as an intermediate object and manipulate it later. At Stitch Fix this is necessary because various dependent columns that a model queries (e.g. `MULTIPLIER_...` and `OFFSET_...`) are derived from the model's name.