Commit Graph

5191 Commits

Author SHA1 Message Date
Gerhard Sittig 72903e9d55 input/csv: rework user accessible options for consistency
Rename the CSV input module's option keywords. To better reflect their
purpose, and for consistency across the rather complex set of options
and how they interact. Rearrange the list of options (not that the order
matters from the outside, but it's good to have during maintenance).

Update builtin help texts which will show up in applications, as well as
the source code comments which discuss these options in greater detail.
Would be nice to have a "general" help text for input modules which is
not tied to one single option, to provide an overview or use examples.
Arrange the option keys, short and long help texts such that the source
better reflects the applications' screen layout. To better support
future maintenance, again.

Consistently separate multi-work keywords for improved readability.
Prefer underscores over dashes for consistency with common keys in
shared infrastructure in other project sources (device options, MQ
items, etc).
2019-12-21 18:20:04 +01:00
Gerhard Sittig 1a920e33fe input/csv: extend column-formats support for backwards compatibility
Extend the "column-formats" option support in the CSV input module to
also support wildcards and automatic channel count detection. Move the
format string interpretation to the location where the first data line
or the optional header line are seen. Map the simple options (single
column number and channel count, or first column number and optional
channel count) to a format string, to unify internal code paths. Remove
code paths for the previous specific yet limited scenarios.

Rephrase the condition which keeps executing the "initial receive"
phase. The text line termination sequence gets derived from the first
complete text line, but other essential information is only gathered
later, potentially after skipping a large (user specified) amount of
input data. Keep checking for this essential setup data until data or
the header actually were seen, before regular processing of input data
starts.
2019-12-21 18:20:04 +01:00
Gerhard Sittig 2142a79b53 input/csv: introduce column-formats option (flexible logic import)
Extend the CSV input module, introduce support for the "column-formats="
option. This syntax can express the previous single- and multi-column
semantics, as well as any arbitrary order of to-get-ignored, and single-
and multi-bit columns in several formats.

The previous "simple" keywords for single and multi column modes still
are in place, it's yet to get determined whether to axe them. Depends on
whether users can handle the format strings for these simple cases.
2019-12-21 18:20:04 +01:00
Gerhard Sittig ef0b9935cf input/csv: fixup input file line number handling
The previous implementation allowed CSV input files to use any line
termination in either CR only, LF only, or CR/LF format. The first EOL
was searched for and was recorded, but then was not used. Instead any of
CR or LF were considered a line termination. "Raw data processing" still
was correct, but line numbers in diagnostics were way off, and optional
features like skipping first N lines were not effective. Fix that.

Source code inspection suggests the "startline" feature did not work at
all. The user provided number was used in the initial optional search
for the header line (to get signal names) or auto-determination of the
number of columns. But then was not used when the file actually got
processed.
2019-12-21 18:20:04 +01:00
Gerhard Sittig 836fac9cf6 input/csv: unassorted adjustment, mostly "column processing" related
Reduce "state" in the CSV input module's context. Stick with variables
that are local to routines when knowledge of details need not be global.
Really base the processing of a column's input text on the column's
processing information which was gathered in the setup phase.

Rename few identifiers, to explicitly refer to logic channels (the only
currently supported data type of the CSV input module). Cease feeding
logic data to the session bus when there are no logic channels at all
(currently not really an option). Prepare for simpler dispatching of
parse routines should more data types get added in a future version.

Reduce some "clutter" (overly fragmented stuff that should go together
since it forms logical groups and is not really standalone). Address a
few more minor style nits (sizeof() redundancy, "seemingly inverse"
string comparison phrases).
2019-12-21 18:20:04 +01:00
Gerhard Sittig f6dcb3200d input/csv: improve "channel name from header line" logic
Improve the code paths which determine logic channels' names from an
optional CSV file header line. Strip optional quotes from the column's
input text (re-use a SCPI helper routine for that). Also use the channel
name for multi-bit fields, append [0] etc suffixes in that case. Comment
on the manipulation of input data, which is acceptable since that very
data won't get processed another time in another code path.
2019-12-21 18:20:04 +01:00
Gerhard Sittig e53f32d2b8 input/csv: introduce generic "column processing" support
Rephrase the CSV input module's implementation such that generic support
to "process a column" becomes available. All columns of an input file's
text line get inspected, a column can either get ignored, or converted
to logic data. A future version can then remove the current limitations
of single- and multi-column modes (either one single multi-bit cell, or
multiple single-bit cells which must be adjacent).

Combine the bin/oct/hex parse routines into one routine which handles up
to four bits per input number digit with common logic. Availability of
more data than channels (according to user specs) is not fatal.

Drop the counter intuitive "first-channel" option, use "first-column"
instead. Warn when comment leader and column separator are identical
(was silent before, may be unexpected). Extend diagnostics and address
minor readability nits, update comments. Rephrase logic channel name
assignment.

Use simple scalar options to derive generic processing details: Either
'single-column' and 'numchannels' are required, with an optional
'format' spec (resulting in single-column mode). Or 'first-column' with
an optional 'numchannels' (multi-column mode with fixed format, using
all available columns by default). The default is multi-column mode with
one logic channel per column and spanning all columns on a text line.
2019-12-21 18:20:04 +01:00
Gerhard Sittig de8fe3b515 input/csv: improve robustness of "use header for channel names"
Don't clobber the value of the user provided 'header' option. Use a
separate flag to track whether the header line was seen before, or
needs to get skipped when it passes by.
2019-12-21 18:20:04 +01:00
Gerhard Sittig 246aca5f54 input/csv: move samplerate meta packet to logic data feed submission
Move the communication of the samplerate meta packet to the very spot
where logic sample data gets sent. This allows to optionally determine
late the samplerate, potentially from input data instead of user specs.
2019-12-21 18:20:04 +01:00
Gerhard Sittig 626c388abf input/csv: rearrange text to logic data conversion and datafeed
Move the helper routines which arrange for the data feed to an earlier
spot, so that references resolve without forward declarations. Rename
routines to reflect that they deal with logic data.

Slightly unobfuscate column text to logic data conversion, and reduce
redundancy. Move sample data preset to a central location.

Rephrase error messages, provide stronger hints as to why the input text
for a conversion was considered invalid.
2019-12-21 18:20:04 +01:00
Gerhard Sittig dbc38383b2 input/csv: stricter input data test for multi column mode
The previous implementation assumed that in multi-column mode each cell
communicates exactly one bit of input (a logic channel). But only the
first character got tested. Tighten the check, to cover the whole input
text. This rejects fully invalid input, as well as increases robustness
since multi-bit input like "100" was mistaken as a value of 1 before.
2019-12-21 18:20:04 +01:00
Gerhard Sittig 19267272d3 input/csv: slightly shuffle text routines, add bin/hex/oct doc
Add documentation to the bin/hex/oct text parse routines, and move the
bin/hex/oct dispatcher to the location where its invoked routines are.
Stick with a TODO comment for parse_line() to reduce the diff size.
2019-12-21 18:20:04 +01:00
Gerhard Sittig 9eab4435f0 input/csv: unobfuscate text line to column splitting
The parse_line() routine is rather complex, optionally accepts an upper
limit for the number of columns, but unconditionally assumes a first one
and drops preceeding fields. The rather generic n and k identifiers are
not helpful.

Use the 'seen' and 'taken' names instead which better reflect what's
actually happening. Remove empty lines which used to tear apart groups
of instructions which are strictly related. This organization neither
was helpful during maintenance.
2019-12-21 18:20:04 +01:00
Gerhard Sittig b2c4dde226 input/csv: trim whitespace after eliminating comments
Accept when comments are indented, trim the whitespace from text lines
after stripping off the comment. This avoids the processing of lines
which actually are empty, and improves robustness (avoids errors for a
non-fatal situation). Also results in more appropriate diagnostics at
higher log levels.
2019-12-21 18:20:04 +01:00
Gerhard Sittig c6aa9870b4 input/csv: eliminate magic numbers in options declaration
The CSV input module has grown so many options, that counting them by
hand became tedious and error prone. Eliminate the magic numbers in the
associated code paths.

This also has the side effect that the set is easy to re-order just by
adjusting the enum, no other code is affected. Help text and default
values is much easier to verify and adjust with the symbolic references.

[ see 'git diff --word-diff' for the essence of the change ]
2019-12-21 18:20:04 +01:00
Gerhard Sittig ad6a2beec3 input/csv: data type nits (sizes, enums)
Use size_t for things that get counted: column indices, channel numbers
(line numbers already used size_t). De-anonymize an enum to avoid 'int'
where it gets referenced. Adjust printf(3) format strings. Get unsigned
values from option lookups (stick with 32bits, should be acceptable for
spreadsheet columns and channel counts).

Address other minor nits while we are here: Also terminate the last item
in an enum declaration. Add a doxygen comment for parse_line(). Rename a
parameter to achieve tabular doc text layout.
2019-12-21 18:20:04 +01:00
Gerhard Sittig