Open Access Article

Peer-Review Record

Yet Another Compact Time Series Data Representation Using CBOR Templates (YACTS)

Sensors 2023, 23(11), 5124; https://doi.org/10.3390/s23115124

by Sebastian Molina Araque¹, Ivan Martinez², Georgios Z. Papadopoulos¹,*, Nicolas Montavont¹ and Laurent Toutain¹

Reviewer 1: Gregor Schiele

Reviewer 2: Anonymous

Submission received: 5 March 2023 / Revised: 23 May 2023 / Accepted: 23 May 2023 / Published: 27 May 2023

(This article belongs to the Special Issue Feature Papers in the Internet of Things Section 2023)

Round 1

Reviewer 1 Report

This is a good paper with one major shortcoming (with several parts) that needs to be fixed.

I find the related work discussion as well as the choice of comparison formats highly problematic.

While the description of JSON, CBOR etc. is well done, it misses the point. The same is true for the additional binary format discussion.

The authors seem to miss the large body of research on time series formats, e.g., from the semantic streams community, stream compression or update protocols. These should be discussed and compared against. As a side note, semantic formats like RDF should be mentioned anyway, since they have been used in the IoT world for a long time (although the formats used are often not very efficient). Many ideas can also be found by looking more into compression schemes for in-memory data stores, e.g., for compressed RDF stores, as well as compressed transmission schemes for RDF.

Please note: I am aware that you do not actually propose stream compression. Instead you propose a compression format for messages containing multiple values (i.e., a batch of stream elements that are sent together). As such, it may be OK not to compare against stream compression schemes or update protocols… but the paper needs to reflect this better. Right now your motivation suggests that these aspects should be taken into account… but they are not.

The evaluation seems to show that your approach is much, much better than existing ones, but nobody will be surprised that uncompressed data is much bigger than compressed data (even if a rather straightforward compression scheme is used). One simple thing you could add to mitigate this: How does your approach compare to a standard compression scheme? Can you compare against zipping each full-blown JSON file? How much better is your approach compared to that? I am sure it is better, but please quantify this. This way, you compare your special-purpose compression scheme against a general-purpose compression scheme and can report how much you win by being more specific.
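The baseline requested above can be sketched in a few lines. The following is only an illustration, not the YACTS format or the authors' evaluation: the sample data is synthetic, the field names (`timestamp`, `temperature`, `speed`) and all function names are hypothetical, and the "delta" variant is a plain JSON encoding of differences, used here as a stand-in for a special-purpose scheme. DEFLATE via Python's `zlib` plays the role of the general-purpose compressor.

```python
import json
import zlib


def make_samples(n, start_ts=1_700_000_000, period_s=1):
    """Build n synthetic readings (hypothetical data, not taken from the paper)."""
    return [
        {
            "timestamp": start_ts + i * period_s,
            "temperature": 2000 + (i * 7) % 13,  # assumed unit: centi-degrees
            "speed": 500 + (i * 3) % 11,
        }
        for i in range(n)
    ]


def to_bytes(obj):
    """Serialize to compact JSON bytes (no extra whitespace)."""
    return json.dumps(obj, separators=(",", ":")).encode("utf-8")


def delta_encode(samples):
    """Send field names once, the first sample in full, then per-field differences."""
    keys = list(samples[0].keys())
    first = [samples[0][k] for k in keys]
    deltas = [
        [cur[k] - prev[k] for k in keys]
        for prev, cur in zip(samples, samples[1:])
    ]
    return {"keys": keys, "first": first, "deltas": deltas}


def delta_decode(payload):
    """Rebuild the original samples by accumulating the differences."""
    keys = payload["keys"]
    current = list(payload["first"])
    out = [dict(zip(keys, current))]
    for delta in payload["deltas"]:
        current = [c + d for c, d in zip(current, delta)]
        out.append(dict(zip(keys, current)))
    return out


def compare(n=300):
    """Return the size in bytes of each representation for a batch of n samples."""
    samples = make_samples(n)
    raw = to_bytes(samples)
    return {
        "json": len(raw),
        "json_deflate": len(zlib.compress(raw, 9)),
        "delta_json": len(to_bytes(delta_encode(samples))),
    }


if __name__ == "__main__":
    for name, size in compare(300).items():
        print(f"{name}: {size} bytes")
```

Running `compare` on real traces instead of `make_samples` would give the kind of figure asked for here: how many bytes a format-aware encoding saves over simply deflating the JSON. Which of the two compact variants is smaller depends on the data, so the sketch makes no claim about that.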

By the way, what happens if you cannot batch so many measurements? I understand you batch 300 measurements, right? That is useful for uploading all data after a biking tour to a server once you are back home, sure. What if you need live updates? Please discuss this and at least tell us how the results will change (e.g. do we get a linear increase in gain with the number of batched measurements? one would assume, right?).
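One way to frame the batching question is a simple cost model. This is a sketch under stated assumptions, not a result from the paper: assume each uncompressed measurement costs $s$ bytes, each compact measurement costs $d < s$ bytes, and the compact message carries a fixed per-message overhead of $H$ bytes (template reference, first absolute values). All three symbols are hypothetical and introduced only for this illustration.

```latex
\begin{align}
  \text{bytes saved}(n) &= n\,s - (H + n\,d) = n\,(s - d) - H, \\
  \text{compression ratio}(n) &= \frac{n\,s}{H + n\,d}
    \;\xrightarrow{\;n \to \infty\;}\; \frac{s}{d}.
\end{align}
```

Under this model the absolute saving grows linearly with the batch size $n$, while the ratio saturates at $s/d$. For small $n$, as in live updates, the fixed overhead $H$ dominates, and for $n < H/(s-d)$ the compact form is no smaller than the uncompressed one.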

Please check the abstract again for language errors. There are many small ones throughout it. The actual paper body is much better in that regard and actually very nicely written.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Reviewer 2 Report

The paper presents a new format for time series, which utilizes the compactness of CBOR by leveraging delta values to represent measurements, employing tags to represent variables, and utilizing templates to convert the time-series data representation into the appropriate format for IoT applications.

The paper is generally well written but too long. Basic concepts should be dropped: parts of the Technical Background are not necessary, nor are some trivial figures (for example, Figure 1).

It is important to include a comparative table of the characteristics of several time series formats. The authors must also clarify the key contributions of their proposal.

What are the benefits of using their proposal instead of HTTP version 3, which minimizes the packets sent too (just the values with changes)?

The experiments and several results are interesting. However, what happens when the IoT application gathers data that changes several times within a short period? What is the impact on the performance of the proposal (number of packets, energy)? The authors did not evaluate this scenario.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf

Round 2

Reviewer 1 Report

Thank you for your changes. I must confess that I am still confused about some parts of the paper. Some streamlining could help. You say that your approach is mainly aiming to decouple the application format from the sensor format. I do not understand that statement. Any format can be used for this, and it is common practice to use a different format for sending data updates from the one used by the application to access the data. Why is this a new challenge? Interoperability has been a major issue for many years and a solution cannot be to introduce a new format. Instead, the hard part is to have everybody agree on a common vocabulary. That is why I brought up linked data and semantic web technologies, since they tried to fix this problem (with limited success) through ontologies and mapping / inference rules.

With respect to related work, it is good that you added RDF but I was also referring to compression schemes based on RDF, both for storing data and for sending it.

Similarly, although your focus might not be on interoperability, you never evaluate it or argue why your approach is actually helpful in this regard. I understand that this is difficult and as such, I can see why you instead choose to focus your evaluation on the efficiency of the data transmission. But through this, the advantages that you report are focussed on data compression and thus, I still think it is fair to ask for a trivial comparison technology like a standard compression scheme. Yes, it is not ideal and I would welcome that you discuss this in the paper (briefly) but with respect to your reported large savings, it might put things into a more realistic perspective.

Finally, you added live updates, i.e., actual online streams. However, your comparison does not mention that you need to send the context information in addition (in all but trivial cases in which a predefined standard value works, e.g. for the time of the measurement) and you seem to ignore that you need more than a format for this, you need a protocol.

Again, I am confused. Apologies if I still do not get it. However, I think many readers will share my confusion and as such I think more work is needed.

Author Response

Please see the attachment.

Author Response File: Author Response.pdf
