Don't clobber the user provided samplerate (specified by input module
options). This allows the samplerate to get re-determined from a
potentially changed file on disk upon reload.
The list of previously created channels is kept across file reloads. We
also need to keep (or re-create) the list of channels that are used for
datafeed submission of analog data.
Release more allocated resources in the .cleanup() routine, and do reset
internal state such that a .reset() (thus .cleanup()) then .receive()
call sequence will work. This code path is taken for re-import of files
(see bug #1241, CSV was affected).
Slightly unobfuscate the "end of current input chunk" marker in the data
processing loop. Make the variable's identifier reflect that it's not a
temporary, but instead something worth keeping around until needed again.
Unbreak the calculation of line numbers in those situations where input
chunks (including previously accumulated unprocessed data) happen to
start with a line termination. This covers input files which start with
empty lines, as well as environments with multi-byte line termination
sequences (CR/LF) and arbitrary distribution of bytes across chunks.
This fixes bug #968.
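A minimal sketch of line counting that is strictly based on the recorded
termination sequence, assuming previously unprocessed data and the new
chunk were already joined into one buffer. The count_complete_lines()
name and its signature are hypothetical and are not taken from the
module's source.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: count complete text lines in an accumulated
 * buffer. Only full occurrences of the termination sequence count, so
 * a CR at the end of a chunk is kept until its LF arrives. A buffer
 * which starts with a termination yields one (empty) line.
 */
static size_t count_complete_lines(const char *buf, size_t len,
	const char *eol, size_t *consumed)
{
	size_t eol_len, pos, lines;

	eol_len = strlen(eol);
	lines = 0;
	pos = 0;
	*consumed = 0;
	if (!eol_len)
		return 0;
	while (pos + eol_len <= len) {
		if (memcmp(&buf[pos], eol, eol_len) == 0) {
			lines++;
			pos += eol_len;
			/* Everything up to here is safe to process. */
			*consumed = pos;
			continue;
		}
		pos++;
	}
	return lines;
}
```

Data after the 'consumed' offset is an incomplete line and would remain
in the buffer until the next chunk arrives.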
Accept when there is no line termination in the current input chunk. We
cannot assume that calling applications always provide file content in
large enough chunks to span complete lines. And any arbitrary chunk size
which applications happen to use can get exceeded by input files (e.g.
for generated files with wide data or long comments).
Mention the required synchronization of default option values and format
match logic in a prominent location where options are discussed.
Update the TODO list for the CSV input module. Mixed signal handling got
fixed by rearranging channel creation. Samplerate can get determined
from timestamp columns already. The double data type for analog data
remains.
The previous implementation incompletely handled arbitrary data type
orders in mixed signal input files. Rearrange the logic such that all
format specs get parsed first, then all channel creation details get
determined, then all channels get created. It appears to be essential
that all logic channels get created first, resulting in index numbers
starting at 0 and addressing the correct position in bitfields. The
analog channels get created after all logic channels exist. Adjacent
number ranges for channel types also result in more readable logic in
other locations.
This was tested with -I csv:column_formats=t,2a,l,a,x4,a,-,b3 example
data, and works as expected.
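The "logic channels first, analog channels next" numbering can be
sketched as two passes over the parsed column details. This is an
illustration only: the col_class enum, the col_info struct and the
assign_channel_numbers() routine are hypothetical names, the module's
actual data structures may differ.

```c
#include <stddef.h>

/* Hypothetical column classes and per-column details. */
enum col_class { COL_IGNORE, COL_LOGIC, COL_ANALOG };

struct col_info {
	enum col_class cls;
	size_t bit_count;	/* Number of logic bits in this column. */
	size_t first_channel;	/* Assigned by the routine below. */
};

/*
 * First pass: logic channels get index numbers starting at 0, which
 * then directly address bit positions. Second pass: analog channels
 * follow in an adjacent number range. Returns the channel count.
 */
static size_t assign_channel_numbers(struct col_info *cols, size_t count)
{
	size_t idx, next;

	next = 0;
	for (idx = 0; idx < count; idx++) {
		if (cols[idx].cls != COL_LOGIC)
			continue;
		cols[idx].first_channel = next;
		next += cols[idx].bit_count;
	}
	for (idx = 0; idx < count; idx++) {
		if (cols[idx].cls != COL_ANALOG)
			continue;
		cols[idx].first_channel = next;
		next++;
	}
	return next;
}
```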
Implement .format_match() support in the CSV input module. Simple
multi-column files will automatically load without an "-I csv" spec.
Non-default option values still do require the module selection before
options can get passed to the module.
Try to balance a compact format and completeness/accuracy of content for
builtin help texts for the CSV input module's options. Assume a technical
audience (this is signal analysis software after all).
Rename a few internal identifiers which help organize the list of options
and their help texts. Too short names became obscure.
Accept 't' format specs for timestamp columns. Automatically derive the
samplerate from input data when timestamps are available and the user
did not provide a rate. Stick with a simple approach for robustness
since automatic detection is easy to override when it fails. This
feature is mostly about convenience.
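One simple approach of that kind could look like the sketch below, which
assumes that the rate is derived from the difference of the first two
timestamps, in seconds. Whether the module uses exactly this method is
an assumption, and rate_from_timestamps() is a hypothetical name.

```c
#include <stdint.h>

/*
 * Hypothetical helper: derive a samplerate in Hz from two consecutive
 * timestamps in seconds. Returns 0 when no rate can get derived, the
 * caller then would keep the user's spec or run without a rate.
 */
static uint64_t rate_from_timestamps(double ts_first, double ts_second)
{
	double delta;

	delta = ts_second - ts_first;
	if (!(delta > 0.0))
		return 0;
	/* Round to the nearest integer rate. */
	return (uint64_t)(1.0 / delta + 0.5);
}
```

A user provided samplerate would take precedence, the derived value is
only a fallback.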
The previous implementation open coded type checks by comparing an enum
value against specific constants. This was especially ugly since there
are multiple types which all are logic, and future column types neither
get ignored nor have channels associated with them.
Improve readability and robustness by adding helpers which check classes
(ignore, logic, analog) instead of the multitude of -/x/o/b|l/a variants
within the classes.
Also comment on the order of channel creation, and how to improve it.
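The class checks can be sketched as small predicates. The enum and the
helper names below are illustrative and need not match the identifiers
in the source file.

```c
#include <stdbool.h>

/* Hypothetical column format codes, see the -/x/o/b|l/a/t specs. */
enum column_format {
	FORMAT_NONE,
	FORMAT_IGNORE,	/* '-' */
	FORMAT_BIN,	/* 'b' or 'l' */
	FORMAT_OCT,	/* 'o' */
	FORMAT_HEX,	/* 'x' */
	FORMAT_ANALOG,	/* 'a' */
	FORMAT_TIME,	/* 't', neither ignored nor backed by channels */
};

static bool format_is_ignore(enum column_format fmt)
{
	return fmt == FORMAT_IGNORE;
}

static bool format_is_logic(enum column_format fmt)
{
	switch (fmt) {
	case FORMAT_BIN:
	case FORMAT_OCT:
	case FORMAT_HEX:
		return true;
	default:
		return false;
	}
}

static bool format_is_analog(enum column_format fmt)
{
	return fmt == FORMAT_ANALOG;
}
```

Callers then ask "is this logic?" once, instead of repeating the list
of variants in every location.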
Expand the developer comment that's inline in the source file. There is
enough room to explain things. Try to come up with a "one-line" help text
that is precise yet compact, and does inform the user of available format
choices including modifiers. Chances are that longer descriptions start
reducing the usefulness of the help or the visibility of options when
users are in a hurry. Those who care can access the manual.
Mark more options as obsolete, and mention more default values in the
builtin help text. Also tweak a comment on getting channel names from
header lines.
Support for mixed signal CSV input data is desirable and should be
possible. The current implementation just happens to not fully cope with
arbitrary mixes of data types in columns yet. Add a quick workaround,
but also a TODO item to properly address the topic later.
Extend the CSV input module which was strictly limited to logic data so
far. Add support for analog data types. Implement the 'a' column format,
and feed analog data to the session bus.
This implementation feeds data of individual analog channels to the
session bus in separate packets each. This approach was found to work
most reliably, since not all recipients support the submission of
multiple samples for multiple channels in a single packet.
A fixed 'digits' value is used. This needs to get addressed later.
Local experiments suggest that the 'double' data type for analog data
can result in erroneous visual presentation (observed with sigrok-cli).
Use 'float' for now, until the issue is understood and fixed.
Support for double is prepared internally and is easily enabled.
Address several minor nits. Eliminate unneeded variables. Update text to
number conversion comments including wildcard handling. Remove empty
lines in init() which used to spill out a set of lines which all do the
same thing (evaluate a set of options) and shall belong together.
Move the creation of logic channels to the location where format fields
get iterated, and column processing details get derived. This reduces a
lot of redundancy, and simplifies the addition of more data formats.
Update the list of TODO items at the top of the CSV input module's
source. Text line handling (counting line numbers) got fixed. Adding
support for analog channels was prepared, as are timestamp columns.
Rename the CSV input module's option keywords, to better reflect their
purpose, and for consistency across the rather complex set of options
and how they interact. Rearrange the list of options (not that the order
matters from the outside, but it's good to have during maintenance).
Update builtin help texts which will show up in applications, as well as
the source code comments which discuss these options in greater detail.
Would be nice to have a "general" help text for input modules which is
not tied to one single option, to provide an overview or use examples.
Arrange the option keys, short and long help texts such that the source
better reflects the applications' screen layout, again to better support
future maintenance.
Consistently separate multi-word keywords for improved readability.
Prefer underscores over dashes for consistency with common keys in
shared infrastructure in other project sources (device options, MQ
items, etc).
Extend the "column-formats" option support in the CSV input module to
also support wildcards and automatic channel count detection. Move the
format string interpretation to the location where the first data line
or the optional header line are seen. Map the simple options (single
column number and channel count, or first column number and optional
channel count) to a format string, to unify internal code paths. Remove
code paths for the previous specific yet limited scenarios.
Rephrase the condition which keeps executing the "initial receive"
phase. The text line termination sequence gets derived from the first
complete text line, but other essential information is only gathered
later, potentially after skipping a large (user specified) amount of
input data. Keep checking for this essential setup data until data or
the header actually were seen, before regular processing of input data
starts.
Extend the CSV input module, introduce support for the "column-formats="
option. This syntax can express the previous single- and multi-column
semantics, as well as any arbitrary order of to-get-ignored, and single-
and multi-bit columns in several formats.
The previous "simple" keywords for single and multi column modes still
are in place, it's yet to get determined whether to axe them. That
depends on whether users can handle the format strings for these simple
cases.
The previous implementation allowed CSV input files to use any line
termination in either CR only, LF only, or CR/LF format. The first EOL
was searched for and was recorded, but then was not used. Instead any of
CR or LF were considered a line termination. "Raw data processing" still
was correct, but line numbers in diagnostics were way off, and optional
features like skipping first N lines were not effective. Fix that.
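Recording the first EOL can be sketched as below. This is a simplified
stand-in and not the module's routine, the detect_eol() name is
hypothetical. The result would then be the only sequence that counts as
a line termination during later processing.

```c
#include <stddef.h>

/*
 * Hypothetical helper: derive the line termination from the first EOL
 * in the buffer. Returns NULL when more data is needed, which includes
 * a CR at the very end of the buffer since its LF may be part of the
 * next chunk. (A CR-only file that ends in its first CR would need
 * extra handling at end of input, which this sketch omits.)
 */
static const char *detect_eol(const char *buf, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		if (buf[i] == '\n')
			return "\n";
		if (buf[i] != '\r')
			continue;
		if (i + 1 == len)
			return NULL;
		return buf[i + 1] == '\n' ? "\r\n" : "\r";
	}
	return NULL;
}
```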
Source code inspection suggests the "startline" feature did not work at
all. The user provided number was used in the initial optional search
for the header line (to get signal names) or auto-determination of the
number of columns. But then was not used when the file actually got
processed.
Reduce "state" in the CSV input module's context. Stick with variables
that are local to routines when knowledge of details need not be global.
Really base the processing of a column's input text on the column's
processing information which was gathered in the setup phase.
Rename a few identifiers, to explicitly refer to logic channels (the only
currently supported data type of the CSV input module). Cease feeding
logic data to the session bus when there are no logic channels at all
(currently not really an option). Prepare for simpler dispatching of
parse routines should more data types get added in a future version.
Reduce some "clutter" (overly fragmented stuff that should go together
since it forms logical groups and is not really standalone). Address a
few more minor style nits (sizeof() redundancy, "seemingly inverse"
string comparison phrases).
Improve the code paths which determine logic channels' names from an
optional CSV file header line. Strip optional quotes from the column's
input text (re-use a SCPI helper routine for that). Also use the channel
name for multi-bit fields, append [0] etc suffixes in that case. Comment
on the manipulation of input data, which is acceptable since that very
data won't get processed another time in another code path.
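A sketch of the naming scheme, under the assumption that the suffix is
only appended for columns with more than one bit. The unquote() routine
is a plain C stand-in for the SCPI helper which the module re-uses, and
make_channel_name() is a hypothetical name.

```c
#include <stdio.h>
#include <string.h>

/*
 * Stand-in for the SCPI unquote helper: strip one pair of enclosing
 * double quotes. Manipulates the input text in place.
 */
static char *unquote(char *text)
{
	size_t len;

	len = strlen(text);
	if (len >= 2 && text[0] == '"' && text[len - 1] == '"') {
		text[len - 1] = '\0';
		text++;
	}
	return text;
}

/* Hypothetical helper: "name" for one bit, "name[N]" for multi-bit. */
static void make_channel_name(char *name, size_t size,
	const char *column, size_t bit_count, size_t bit_index)
{
	if (bit_count > 1)
		snprintf(name, size, "%s[%zu]", column, bit_index);
	else
		snprintf(name, size, "%s", column);
}
```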
Rephrase the CSV input module's implementation such that generic support
to "process a column" becomes available. All columns of an input file's
text line get inspected, a column can either get ignored, or converted
to logic data. A future version can then remove the current limitations
of single- and multi-column modes (either one single multi-bit cell, or
multiple single-bit cells which must be adjacent).
Combine the bin/oct/hex parse routines into one routine which handles up
to four bits per input number digit with common logic. Availability of
more data than channels (according to user specs) is not fatal.
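The common logic for bin/oct/hex can be sketched with the number of bits
per digit as the only difference between the formats. This simplified
illustration accumulates into one integer, while the module distributes
the bits across logic channels. The parse_logic_text() name is
hypothetical, and values wider than 64 bits are not handled here.

```c
#include <stdint.h>

/*
 * Hypothetical helper: convert text with 1 (bin), 3 (oct) or 4 (hex)
 * bits per digit. Returns 0 upon success, -1 for invalid input.
 */
static int parse_logic_text(const char *text,
	unsigned int bits_per_digit, uint64_t *value)
{
	uint64_t result;
	unsigned int digit;
	char c;

	if (!text || !*text || bits_per_digit < 1 || bits_per_digit > 4)
		return -1;
	result = 0;
	while ((c = *text++) != '\0') {
		if (c >= '0' && c <= '9')
			digit = c - '0';
		else if (c >= 'a' && c <= 'f')
			digit = c - 'a' + 10;
		else if (c >= 'A' && c <= 'F')
			digit = c - 'A' + 10;
		else
			return -1;
		/* Reject digits which exceed the format's range. */
		if (digit >= (1u << bits_per_digit))
			return -1;
		result = (result << bits_per_digit) | digit;
	}
	*value = result;
	return 0;
}
```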
Drop the counterintuitive "first-channel" option, use "first-column"
instead. Warn when comment leader and column separator are identical
(was silent before, may be unexpected). Extend diagnostics and address
minor readability nits, update comments. Rephrase logic channel name
assignment.
Use simple scalar options to derive generic processing details: Either
'single-column' and 'numchannels' are required, with an optional
'format' spec (resulting in single-column mode). Or 'first-column' with
an optional 'numchannels' (multi-column mode with fixed format, using
all available columns by default). The default is multi-column mode with
one logic channel per column and spanning all columns on a text line.
Don't clobber the value of the user provided 'header' option. Use a
separate flag to track whether the header line was seen before, or
needs to get skipped when it passes by.
Move the communication of the samplerate meta packet to the very spot
where logic sample data gets sent. This allows the samplerate to get
determined late, potentially from input data instead of user specs.
Move the helper routines which arrange for the data feed to an earlier
spot, so that references resolve without forward declarations. Rename
routines to reflect that they deal with logic data.
Slightly unobfuscate column text to logic data conversion, and reduce
redundancy. Move sample data preset to a central location.
Rephrase error messages, provide stronger hints as to why the input text
for a conversion was considered invalid.
The previous implementation assumed that in multi-column mode each cell
communicates exactly one bit of input (a logic channel). But only the
first character got tested. Tighten the check, to cover the whole input
text. This rejects fully invalid input, as well as increases robustness
since multi-bit input like "100" was mistaken as a value of 1 before.
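The tightened check can be illustrated as follows, assuming that a cell
in this mode must consist of exactly one '0' or '1' character. The
parse_single_bit() name is hypothetical.

```c
#include <stdbool.h>

/*
 * Hypothetical helper: accept "0" and "1" only. Tests the whole text
 * and not just the first character, so "100" gets rejected instead of
 * getting mistaken for a value of 1.
 */
static int parse_single_bit(const char *text, bool *bit)
{
	if (!text || (text[0] != '0' && text[0] != '1'))
		return -1;
	if (text[1] != '\0')
		return -1;
	*bit = text[0] == '1';
	return 0;
}
```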
Add documentation to the bin/hex/oct text parse routines, and move the
bin/hex/oct dispatcher to the location where its invoked routines are.
Stick with a TODO comment for parse_line() to reduce the diff size.
The parse_line() routine is rather complex, optionally accepts an upper
limit for the number of columns, but unconditionally assumes a first one
and drops preceding fields. The rather generic n and k identifiers are
not helpful.
Use the 'seen' and 'taken' names instead which better reflect what's
actually happening. Remove empty lines which used to tear apart groups
of instructions which are strictly related. This organization was not
helpful during maintenance either.
Accept when comments are indented, trim the whitespace from text lines
after stripping off the comment. This avoids the processing of lines
which actually are empty, and improves robustness (avoids errors for a
non-fatal situation). Also results in more appropriate diagnostics at
higher log levels.
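The order of operations (strip the comment first, then trim) can be
sketched in plain C as shown below. The module itself may use library
helpers for this, strip_comment_and_trim() is a hypothetical name.

```c
#include <ctype.h>
#include <string.h>

/*
 * Hypothetical helper: cut the line at the comment leader, then trim
 * leading and trailing whitespace. Manipulates the text in place and
 * returns a pointer into it. An empty result means "skip this line".
 */
static char *strip_comment_and_trim(char *line, const char *comment)
{
	char *p, *end;

	if (comment && *comment) {
		p = strstr(line, comment);
		if (p)
			*p = '\0';
	}
	while (isspace((unsigned char)*line))
		line++;
	end = line + strlen(line);
	while (end > line && isspace((unsigned char)end[-1]))
		end--;
	*end = '\0';
	return line;
}
```

An indented comment then results in an empty string, which the caller
can silently skip instead of attempting a conversion.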
The CSV input module has grown so many options, that counting them by
hand became tedious and error prone. Eliminate the magic numbers in the
associated code paths.
This also has the side effect that the set is easy to re-order just by
adjusting the enum, no other code is affected. Help text and default
values are much easier to verify and adjust with the symbolic references.
[ see 'git diff --word-diff' for the essence of the change ]
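The symbolic references can be sketched with an enum and designated
initializers. The option set, the identifiers and the default values
below are made up for illustration and do not reproduce the module's
table.

```c
#include <stddef.h>

/* Hypothetical subset of options, re-order by editing the enum only. */
enum option_index {
	OPT_COL_FMTS,
	OPT_START_LINE,
	OPT_HEADER,
	OPT_SAMPLERATE,
	OPT_MAX,	/* Number of options, keep this last. */
};

struct option_desc {
	const char *id;
	const char *def;	/* Default value as text, illustrative. */
};

static const struct option_desc options[OPT_MAX] = {
	[OPT_COL_FMTS] = { "column_formats", "" },
	[OPT_START_LINE] = { "start_line", "1" },
	[OPT_HEADER] = { "header", "false" },
	[OPT_SAMPLERATE] = { "samplerate", "0" },
};

/* Look up a default by symbolic index instead of a magic number. */
static const char *option_default(enum option_index idx)
{
	return idx < OPT_MAX ? options[idx].def : NULL;
}
```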
Use size_t for things that get counted: column indices, channel numbers
(line numbers already used size_t). De-anonymize an enum to avoid 'int'
where it gets referenced. Adjust printf(3) format strings. Get unsigned
values from option lookups (stick with 32bits, should be acceptable for
spreadsheet columns and channel counts).
Address other minor nits while we are here: Also terminate the last item
in an enum declaration. Add a doxygen comment for parse_line(). Rename a
parameter to achieve tabular doc text layout.
Rephrase the #include statements in the CSV input module. "config" is
not a system header but is provided by the application source code.
Separate the config and system and application groups (their order is
essential). Alpha-sort the files within their group for simplified
maintenance.
Do for the CSV input module what commit 08f8421a9e did for VCD. Check
the channel list for consistency across re-imports of the same file.
This addresses the CSV part of bug #1241.
The cleanup() routine gets invoked upon shutdown, as well as before
re-importing another file. The cleanup() routine must not release
resources which get allocated in the init() routine, as the init()
routine won't run again in the module's lifetime. The cleanup() routine
must void those context fields which get evaluated in the next receive()
calls.
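The split of responsibilities can be sketched with a reduced context.
The struct layout and the field names are hypothetical, the point is
only which members get released and voided, and which ones are kept.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily reduced module context. */
struct context {
	char *delimiter;	/* Set up in init(), must survive. */
	char *termination;	/* Derived in receive(), heap allocated. */
	size_t line_number;
	bool started;
};

static void context_cleanup(struct context *inc)
{
	/* Release and void what receive() created or evaluates. */
	free(inc->termination);
	inc->termination = NULL;
	inc->line_number = 0;
	inc->started = false;
	/* Keep inc->delimiter, init() won't run again to restore it. */
}
```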
The previous implementation inspected the input stream's samplerate, and
simply used the next 1kHz/1MHz/1GHz timescale for VCD export. Re-import
of the exported file might suffer from rather high an overhead, where
users might have to downsample the input stream. Also exported data
might use an "odd" timescale which doesn't represent the input stream's
timing well.
Rephrase the samplerate to VCD timescale conversion such that the lowest
frequency is used which satisfies the file format's constraints as well
as provides high enough a resolution to communicate the input stream's
timing with minimal loss. Do limit this scaling support to at most three
orders above the input samplerate, to most appropriately cope with odd
rates.
As a byproduct the rephrased implementation transparently supports rates
above 1GHz. Input streams with no samplerate now result in 1s timescale
instead of the 1ms timescale of the previous implementation.
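A sketch of such a conversion, assuming that "satisfies the file format's
constraints" means a power of ten, and that "high enough a resolution"
means the samplerate divides the timescale frequency. The routine name
is hypothetical and the details may differ from the module's code.

```c
#include <stdint.h>

/*
 * Hypothetical helper: get the timescale frequency in Hz for a given
 * samplerate. Overflow for extreme rates is not handled here.
 */
static uint64_t timescale_freq(uint64_t samplerate)
{
	uint64_t timescale;
	int tries;

	/* No samplerate: 1Hz, which corresponds to a 1s timescale. */
	if (!samplerate)
		return 1;

	/* Go to the lowest power of ten that is not below the rate. */
	timescale = 1;
	while (timescale < samplerate)
		timescale *= 10;

	/* Go up at most three more decades to avoid loss of precision. */
	for (tries = 0; tries < 3; tries++) {
		if (timescale % samplerate == 0)
			break;
		timescale *= 10;
	}
	return timescale;
}
```

With this sketch a 1MHz input maps to 1MHz, a 400MHz input maps to
10GHz, and an odd rate like 3Hz stops at 10kHz after three steps.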
The 'period' member of the VCD output module's context is supposed to
hold frequencies that correspond to the timescale used during export.
An 'int' (in combination with VCD's 1/10/100 constraint) thus would
result in a 1GHz limit, use uint64_t instead to support higher rates.
Iterate over the received sample set first, before iterating over the
respective sample's number of channels. This avoids redundant extraction
of sampled bits (which saves only little), but also increases locality
of processed data (though string accumulation still may be expensive).
It also adds the future option of RLE compression during accumulation of
output data, which perfectly matches the WaveDrom syntax for repeated
bit patterns.
Rearrange the order of routines in the wavedrom output module. Keep the
flow of .receive() -> .process_logic() -> .wavedrom_render() in one common
group of routines, which is not disrupted by the .init() and .cleanup()
routines which are kind of boilerplate in the source file. This increases
readability and maintainability.
Adjust brace style, use C language comments, drop camel case. Use size_t
for indices and offsets. Unobfuscate the open/close logic of rendered
output. Allocate zero-filled memory, reduce sizeof() redundancy. Don't
SHOUT in the module's .name property.
[ Changes indentation, see 'git diff -w -b' for review. ]
Add a comment on the logic which skips the upper 64 bytes of a 512 bytes
chunk in the Asix Sigma's sample memory. Move the initial assignment and
the subsequent update from a value which was retrieved from a hardware
register closer together for awareness during maintenance. Pre-setting a
high position value that will never match when the feature is not in use
is very appropriate.
Adjust the sigma_read_pos() routine to handle triggerpos identically to
stoppos. The test condition's intention is to check whether a decrement
of the position ends up in the meta data section of a chunk. The previous
implementation tested whether a pointer to the position variable ended in
0x1ff when decremented -- which is unrelated to the driver's operation.
It's assumed that no harm was done because the trigger feature is
unsupported (see bug #359).
This silences the compiler warning reported in bug #1411.
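The difference between the two expressions can be illustrated with a
reduced helper. The step_back_position() name is hypothetical, and the
adjustment by 64 is an assumption which is derived from the size of the
meta data section mentioned above.

```c
#include <stdint.h>

/*
 * Hypothetical helper: decrement a sample memory position. When the
 * result ends up in the last byte of a 512 bytes chunk (0x1ff), which
 * is in the meta data section, move back by another 64 bytes.
 *
 * Note the operator order: "--*pos" decrements the position value,
 * whereas "*--pos" would decrement the pointer and then inspect some
 * unrelated memory location.
 */
static void step_back_position(uint32_t *pos)
{
	if ((--*pos & 0x1ff) == 0x1ff)
		*pos -= 64;
}
```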