INFO
This page describes how to work with columns in Datameer.
Renaming Columns
To rename a column:
- Right-click a column name and select "Rename" or double-click a column name. The column name is marked as editable.
INFO: To cancel column renaming, press the 'Escape' key.
- Enter the new column name.
INFO: Column names must contain only standard upper-case or lower-case characters, numbers and/or underscores. A column name cannot begin with a number.
INFO: Column names in all workbook sheet types are case-sensitive. For example, the column names 'Foo' and 'fOO' are treated as different columns within the same worksheet.
- Confirm the entry with the "Enter" key. Renaming is finished.
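The naming rules above can be checked before renaming, for example when column names are generated by a script. The following is a minimal sketch, not Datameer's own validation: it assumes that "standard characters" means ASCII letters and that a name may begin with an underscore. The function names `is_valid_column_name` and `has_unique_names` are hypothetical.

```python
import re

# Assumption: "standard characters" are ASCII letters. A name may contain
# letters, digits and underscores, and must not begin with a digit.
COLUMN_NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def is_valid_column_name(name: str) -> bool:
    """Return True if `name` satisfies the naming rules described above."""
    return COLUMN_NAME_PATTERN.fullmatch(name) is not None


def has_unique_names(names: list) -> bool:
    """Case-sensitive uniqueness check: 'Foo' and 'fOO' are different names."""
    return len(set(names)) == len(names)


if __name__ == "__main__":
    for candidate in ["Order_ID2", "2nd_column", "my column"]:
        print(candidate, is_valid_column_name(candidate))
    print(has_unique_names(["Foo", "fOO"]))
```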
Adding Columns
INFO
A column can only be added next to a column that has content. Added columns appear to the left of the selected column.
To add a column, right-click a column name and select "Add Column". The new, empty column appears to the left. Adding a column is finished.
Reordering Columns
Resizing Columns
Splitting Columns
Encoding Columns
INFO
Column encoding performs ordinal, one-hot or binned encoding on column data, assigning a unique numeric value to each categorical or continuous value. Once applied, column encoding can be updated as needed until the desired results are achieved.
Columns with a high cardinality are not suited for ordinal or binned encoding.
Column encoding provides a consistent view of prepared data which is especially helpful for teams working together on model building and testing activities.
Ordinal Encoding
For ordinal encoding:
- Right-click on a column header and select "Encoding" or click on the "Encode Column" icon from the toolbar. The 'Encode Column' dialog is displayed on the right.
- If needed, change the column by entering the required column name in 'Column'.
- Select the encoding type "Ordinal Encoding" from the drop-down. The remaining options adapt to the selected encoding type.
- Decide how to deal with unknown values by selecting the required option.
INFO: 'Drop Value' ignores values beyond the first 100 most frequent. 'Default Value' shows values that cannot be encoded.
- View the top 32 values (by count).
- If needed, add a new value in the blank field, change the order of the top values or delete single values.
- Confirm with "Encode". The encoding result is displayed in a new encoding sheet within the workbook. Ordinal Encoding is finished.
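The effect of ordinal encoding can be illustrated outside of Datameer. The following is a minimal sketch, not Datameer's implementation. It assumes that codes are assigned by descending frequency, that 'Drop Value' leaves an unknown value unencoded (shown here as `None`), and that 'Default Value' maps unknown values to a fallback code. The function names and the `default_code` parameter are hypothetical.

```python
from collections import Counter


def build_ordinal_mapping(values, max_values=100):
    """Map the most frequent values to 0, 1, 2, ... by descending count."""
    counts = Counter(values)
    top = counts.most_common(max_values)
    return {value: code for code, (value, _count) in enumerate(top)}


def ordinal_encode(values, mapping, unknown="drop", default_code=-1):
    """Encode values with `mapping`.

    unknown="drop":    values without a code become None (assumption).
    unknown="default": values without a code become `default_code` (assumption).
    """
    encoded = []
    for value in values:
        if value in mapping:
            encoded.append(mapping[value])
        elif unknown == "default":
            encoded.append(default_code)
        else:
            encoded.append(None)
    return encoded


if __name__ == "__main__":
    colors = ["red", "blue", "red", "green", "red", "blue"]
    mapping = build_ordinal_mapping(colors, max_values=2)
    print(mapping)
    print(ordinal_encode(colors, mapping, unknown="default"))
```

Reordering the top values in the dialog corresponds to changing which code each value receives in the mapping.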
One-hot Encoding
For one-hot encoding:
- Right-click on a column header and select "Encoding" or click on the "Encode Column" icon from the toolbar. The 'Encode Column' dialog is displayed on the right.
- If needed, change the column by entering the required column name in 'Column'.
- Select the encoding type "1-Hot Encoding" from the drop-down. The remaining options adapt to the selected encoding type.
- Decide how to deal with unknown values by selecting the required option.
INFO: 'Drop Value' ignores values beyond the first 100 most frequent. 'Include at last column' adds a new element to the list that encodes all values beyond the 100 most frequent together.
- Select the output format from the dropdown 'Output'.
INFO: 'As List' keeps all binary pairs together in a single column. 'As Column' creates binary pairs each in their own column.
- View the top 32 values (by count).
- If needed, add a new value in the blank field, change the order of the top values or delete single values.
- Confirm with "Encode". The encoding result is displayed in a new encoding sheet within the workbook. One-hot encoding is finished.
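The two output formats can be illustrated with a small example. The following is a minimal sketch, not Datameer's implementation. It assumes that 'As List' produces one list of 0/1 flags per row, that 'As Column' produces one 0/1 column per value, and that 'Include at last column' adds one final slot for all remaining values. The function names, the `prefix` parameter and the generated column names are hypothetical.

```python
def one_hot_as_list(values, categories, include_other=False):
    """Return one list of 0/1 flags per input value ('As List' style)."""
    index = {category: i for i, category in enumerate(categories)}
    width = len(categories) + (1 if include_other else 0)
    rows = []
    for value in values:
        row = [0] * width
        if value in index:
            row[index[value]] = 1
        elif include_other:
            row[-1] = 1  # last slot collects every value not in `categories`
        rows.append(row)
    return rows


def one_hot_as_columns(values, categories, include_other=False, prefix="col"):
    """Return a dict of column name -> 0/1 list ('As Column' style)."""
    rows = one_hot_as_list(values, categories, include_other)
    names = [f"{prefix}_{category}" for category in categories]
    if include_other:
        names.append(f"{prefix}_other")
    return {name: [row[i] for row in rows] for i, name in enumerate(names)}


if __name__ == "__main__":
    sizes = ["S", "M", "XL"]
    print(one_hot_as_list(sizes, ["S", "M"], include_other=True))
    print(one_hot_as_columns(sizes, ["S", "M"]))
```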
Binned Encoding
For binned encoding:
- Right-click on a column header and select "Encoding" or click on the "Encode Column" icon from the toolbar. The 'Encode Column' dialog is displayed on the right.
- If needed, change the column by entering the required column name in 'Column'.
- Select the encoding type "Binned Encoding" from the drop-down. The remaining options adapt to the selected encoding type.
- Select the output format from the dropdown 'Output'.
INFO: 'As List' keeps all binary pairs together in a single column. 'Ordinal' encodes as ordinal numbers in a single column.
- View the default value distributions.
INFO: The graph changes according to the number of dividers.
- Enter the required bin dividers.
INFO: There are 3 dividers set by default, e.g. 'Divider 1' contains 25% of the selected column values.
INFO: To delete a divider, click on "x" next to the required divider.
- If needed, add a new divider in the field 'Add new Divider'.
INFO: The percentage of the other dividers adjusts with each new divider.
- Confirm with "Encode". The encoding result is displayed in a new encoding sheet within the workbook. Binned encoding is finished.
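The role of the dividers can be illustrated with a small example. The following is a minimal sketch, not Datameer's implementation. It assumes that the three default dividers are the quartile boundaries (25%, 50%, 75%) of the column values, that a value equal to a divider falls into the higher bin, and that 'Ordinal' returns the bin number while 'As List' returns one 0/1 flag per bin. The function names are hypothetical.

```python
from bisect import bisect_right
from statistics import quantiles


def default_dividers(values):
    """Return three cut points that split the values into four quartile bins."""
    return quantiles(values, n=4)


def bin_ordinal(values, dividers):
    """Return the bin number (0-based) for each value ('Ordinal' style)."""
    cuts = sorted(dividers)
    # bisect_right places a value equal to a divider into the higher bin.
    return [bisect_right(cuts, value) for value in values]


def bin_as_list(values, dividers):
    """Return one list of 0/1 flags per value, one flag per bin ('As List' style)."""
    n_bins = len(dividers) + 1
    return [
        [1 if i == bin_number else 0 for i in range(n_bins)]
        for bin_number in bin_ordinal(values, dividers)
    ]


if __name__ == "__main__":
    ages = [18, 22, 25, 31, 38, 44, 52, 67]
    dividers = default_dividers(ages)
    print(dividers)
    print(bin_ordinal(ages, dividers))
```

Adding or removing a divider changes the number of bins by one, which is why the displayed distribution adjusts with each change.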