The `list_available_variables()` capability exposes tags along with their names and types, enabling querying of the available outputs for specific tag matches.
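As a rough illustration of tag-based querying, the sketch below filters a list of variables by tag. The `Variable` dataclass, the `filter_by_tags` helper, and the sample data are all hypothetical stand-ins for whatever `list_available_variables()` returns; they are not part of the framework.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Variable:
    """Hypothetical stand-in for one available output: a name, a type, and tags."""
    name: str
    type: type
    tags: Dict[str, str] = field(default_factory=dict)


def filter_by_tags(variables: List[Variable], **wanted: str) -> List[Variable]:
    """Keep only the variables whose tags contain every requested key/value pair."""
    return [
        v for v in variables
        if all(v.tags.get(key) == value for key, value in wanted.items())
    ]


# Made-up data standing in for the result of list_available_variables().
available = [
    Variable("spend", float, {"owner": "marketing", "pii": "false"}),
    Variable("email", str, {"owner": "crm", "pii": "true"}),
    Variable("signups", int, {"owner": "marketing", "pii": "false"}),
]

marketing_outputs = [v.name for v in filter_by_tags(available, owner="marketing")]
print(marketing_outputs)
```

Matching on exact key/value equality is only one possible design; a real query might also want to match on the presence of a tag key alone.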
When you pass a list of columns to `@extract_columns`, call it with an asterisk so the list is unpacked, e.g. `@extract_columns(*my_columns)`.
`field_type` -- this information is used for static compilation to ensure downstream uses are expecting the right type.
`@config.when` allows you to specify different implementations depending on configuration parameters.
There is currently no `@config.otherwise(...)` decorator, so make sure your `config.when` decorators specify the full set of configuration possibilities. Any missing cases will not have that output column (and subsequent downstream nodes may error out if they ask for it). To make this easier, we have a few more `@config` decorators:
- `@config.when_not(param=value)`: will be included if the parameter is not equal to the value specified.
- `@config.when_in(param=[value1, value2, ...])`: will be included if the parameter is equal to one of the specified values.
- `@config.when_not_in(param=[value1, value2, ...])`: will be included if the parameter is not equal to any of the specified values.
- `@config`: if you're feeling adventurous, you can pass in a lambda function that takes in the entire configuration and resolves to `True` or `False`. You probably don't want to do this.
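To make the semantics of these predicates concrete, here is a minimal sketch that models each one as a function from a configuration dictionary to `True` or `False`. The helper names (`when`, `when_not`, `when_in`, `when_not_in`, `pick`) and the `region` parameter are illustrative assumptions, not the framework's implementation. It also shows one way to cover the "everything else" case by hand with a `when_not_in` style predicate.

```python
from typing import Any, Callable, Dict, List

Config = Dict[str, Any]
Resolver = Callable[[Config], bool]


# Toy predicates: each returns a resolver that inspects the whole config.
def when(**key_values: Any) -> Resolver:
    return lambda cfg: all(cfg.get(k) == v for k, v in key_values.items())


def when_not(**key_values: Any) -> Resolver:
    return lambda cfg: all(cfg.get(k) != v for k, v in key_values.items())


def when_in(**key_values: Any) -> Resolver:
    return lambda cfg: all(cfg.get(k) in vs for k, vs in key_values.items())


def when_not_in(**key_values: Any) -> Resolver:
    return lambda cfg: all(cfg.get(k) not in vs for k, vs in key_values.items())


def pick(implementations: Dict[str, Resolver], cfg: Config) -> List[str]:
    """Return the names of the implementations whose predicate accepts cfg."""
    return [name for name, resolver in implementations.items() if resolver(cfg)]


# Three hypothetical implementations of the same output, keyed by region.
implementations = {
    "us_impl": when(region="US"),
    "eu_impl": when_in(region=["DE", "FR"]),
    # Stands in for the missing "otherwise": every region not listed above.
    "fallback_impl": when_not_in(region=["US", "DE", "FR"]),
}

print(pick(implementations, {"region": "US"}))
print(pick(implementations, {"region": "FR"}))
print(pick(implementations, {"region": "JP"}))
```

If the three predicates did not cover every region, `pick` would return an empty list for the uncovered ones, which corresponds to the missing output column described above.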
`parameterized` allows you to keep your code DRY by reusing the same function to create multiple distinct outputs. The `parameter` keyword argument has to match one of the arguments in the function. The rest of the arguments are pulled from outside the DAG. The `assigned_output` keyword argument takes a dictionary mapping a tuple of (output name, documentation string) to a value.
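The shape of `assigned_output` is easier to see in code. The sketch below is a toy expansion step, assuming a hypothetical helper `expand_parameterized` and a made-up function `scaled`; it only mimics the behavior described above and is not how the framework implements it.

```python
import inspect
from typing import Any, Callable, Dict, Tuple


def expand_parameterized(
    fn: Callable,
    parameter: str,
    assigned_output: Dict[Tuple[str, str], Any],
) -> Dict[str, Callable]:
    """Toy stand-in: build one function per (output name, doc) -> value entry."""
    # The parameter has to match one of the function's arguments.
    if parameter not in inspect.signature(fn).parameters:
        raise ValueError(f"{parameter!r} is not an argument of {fn.__name__}")

    def bind(value: Any, doc: str) -> Callable:
        def node(**other_inputs: Any) -> Any:
            # The parameterized argument is fixed; the caller supplies the rest.
            return fn(**{parameter: value}, **other_inputs)
        node.__doc__ = doc
        return node

    return {
        output_name: bind(value, doc)
        for (output_name, doc), value in assigned_output.items()
    }


def scaled(base: float, factor: float) -> float:
    """Hypothetical function reused for several outputs."""
    return base * factor


nodes = expand_parameterized(
    scaled,
    parameter="factor",
    assigned_output={
        ("doubled", "base multiplied by 2"): 2.0,
        ("tripled", "base multiplied by 3"): 3.0,
    },
)

print(nodes["doubled"](base=10.0))
print(nodes["tripled"].__doc__)
```

Each dictionary entry yields one distinct output with its own documentation string, while the function body is written only once.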
`parameterized_inputs` allows you to keep your code DRY by reusing the same function to create multiple distinct outputs. The keyword arguments passed must have the following structure:
OUTPUT_NAME = Mapping of function argument to input that should go into it.
`D_ELECTION_2016_shifted` is an output that corresponds to replacing the function argument with `D_ELECTION_2016`. Similarly, `SOME_OUTPUT_NAME` is an output that corresponds to replacing that argument with `SOME_INPUT_NAME`. The documentation for both uses the same function docstring: templatized values in it are replaced with the input parameter names and with a reserved value.
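The mapping and the docstring templating described above can be sketched as follows. Everything here is an assumption made for illustration: the helper `expand_parameterized_inputs`, the function `shifted`, its argument name `value`, the use of plain lists instead of series, and in particular the placeholder name `output_name` for the reserved value.

```python
from typing import Any, Callable, Dict, Mapping


def expand_parameterized_inputs(
    fn: Callable, /, **outputs: Mapping[str, str]
) -> Dict[str, Callable]:
    """Toy stand-in: OUTPUT_NAME = mapping of function argument -> input name."""
    nodes: Dict[str, Callable] = {}
    for output_name, arg_to_input in outputs.items():
        def node(available: Mapping[str, Any], _mapping=arg_to_input) -> Any:
            # Look up each mapped input and pass it in under the argument name.
            return fn(**{arg: available[source] for arg, source in _mapping.items()})
        # Fill the docstring template with the input names and the output name.
        # "output_name" as the reserved placeholder is an assumption of this sketch.
        node.__doc__ = (fn.__doc__ or "").format(output_name=output_name, **arg_to_input)
        nodes[output_name] = node
    return nodes


def shifted(value: list) -> list:
    """{value} shifted by 1 to create {output_name}"""
    return [None] + value[:-1]


nodes = expand_parameterized_inputs(
    shifted,
    D_ELECTION_2016_shifted={"value": "D_ELECTION_2016"},
    SOME_OUTPUT_NAME={"value": "SOME_INPUT_NAME"},
)

# Made-up input data keyed by input name.
inputs = {"D_ELECTION_2016": [1, 2, 3], "SOME_INPUT_NAME": [7, 8, 9]}

print(nodes["D_ELECTION_2016_shifted"](inputs))
print(nodes["D_ELECTION_2016_shifted"].__doc__)
print(nodes["SOME_OUTPUT_NAME"].__doc__)
```

Both outputs share one function body and one docstring template, yet each ends up with documentation that names its own input and output.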
`@does` is a decorator that essentially allows you to run a function over all the input parameters. You can't pass just any function to `@does`; the function passed has to accept any number of inputs and process them all in the same way.
Apply the `@does` decorator to a function and pass it the function to run over the inputs.
The `@does` decorator is currently limited to functions that take exactly one argument, a generic `**kwargs`.
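A minimal sketch of the idea, assuming a hypothetical decorator named `does_sketch` and made-up functions `sum_all` and `total_cost`: the decorated function only declares the inputs, and the function passed to the decorator supplies the logic. The check for a single `**kwargs` argument mirrors the limitation stated above; the rest is illustrative and not the framework's code.

```python
import functools
import inspect
from typing import Any, Callable


def does_sketch(replacing_function: Callable) -> Callable:
    """Toy stand-in: run replacing_function over all inputs of the decorated function."""
    params = list(inspect.signature(replacing_function).parameters.values())
    # Only accept functions whose sole argument is a generic **kwargs.
    if len(params) != 1 or params[0].kind is not inspect.Parameter.VAR_KEYWORD:
        raise ValueError("the function passed must take exactly one argument, **kwargs")

    def decorator(fn: Callable) -> Callable:
        @functools.wraps(fn)  # keep the declared name, docstring, and signature
        def wrapper(**kwargs: Any) -> Any:
            return replacing_function(**kwargs)
        return wrapper

    return decorator


def sum_all(**kwargs: float) -> float:
    """Processes every input the same way: adds them up."""
    return sum(kwargs.values())


@does_sketch(sum_all)
def total_cost(materials: float, labor: float, shipping: float) -> float:
    """Sum of all cost components; the body is empty because sum_all does the work."""


print(total_cost(materials=1.0, labor=2.0, shipping=3.5))
```

The decorated function's signature still documents which inputs are needed, which is why the body can stay empty.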
`@model` allows you to abstract a function that is a model. You will need to implement models that make sense for your business case. Reach out if you need examples.
`GLM` here is not part of the Hamilton framework; it is a user-defined model.
`output_column` parameter -- this is specifically for when the name of the function differs from the output column that it should represent, e.g. if you use the model result as an intermediate object and manipulate it later. At Stitch Fix this is necessary because various dependent columns that a model queries (e.g. `OFFSET_...`) are derived from the model's name.