Status

ACCEPTED

Stakeholders
Outcome

Unable to render Jira issues macro, execution error.

Due date
Owner

Background

To make logging process consistent across Iroha applications logging process should be introduced and logging format should be defined.

Action items

Problem

Unstructured logs are hard to monitor, analyze and process automatically.

  1. Logs should be printed to stdout and stderr.
  2. Log-level (verbosity) should be changeable in runtime, example: system signal (e.g., SIGUSR1, SIGHUP) to reread configs(env variable)  and change the log-level.

Solution

Log macros can log function arguments and result, using https://tools.ietf.org/html/rfc5424 or newer.  Nowadays de facto standard for log is JSON.

LEVEL DATE_TIME function_name[start]: arguments
LEVEL DATE_TIME function_name[end]: result

Example:

DEBUG - 2020-08-04 08:09:40:759899847 - request[start]: 
DEBUG - 2020-08-04 08:09:40:759961514 - self = Client { public_key: PublicKey { digest_function: "ed25519", payload: "[1E, 0, 33, 8A, D, 96, B, B4, 4D, 9E, 7F, 3A, C1, 3C, A, 5D, 89, BF,
  31, 8A, F8, 76, E2, FD, 15, 50, EE, 28, C5, EE, 9E, 63]" }, torii_url: "127.0.0.1:45371" },

Decisions

  • Use tracing as a logging facade.
  • Provide a `/configure` endpoint for changing the log-level at run-time. The request should be provided in terms of a JSON file.
  • Each function should be instrumented to provide vital (determined on a case-by-case basis) information.
  • Errors must be either handled, or logged as `warn` or `error` level messages.
  • Use tracing_bunyan_formatter to provide efficient machine readable logging. The destination is set at run-time and log rotation is the administrator's responsibility.
  • Fields and their values should be recorded directly, preferably (unless otherwise specified), using their names in the source code.
  • Logging that is done unconditionally upon entering/exiting a function is given the TRACE level.
  • Logging that is done upon completion of some or part of a function is given a DEBUG level or higher.

Concerns

  • Implementation of loggers could affect performance (tracing is one of the fastest)
  • Log rotation can become an issue.

Assumptions

  • We do not depend on any implementation, only on format

Risks

  • Format will be inefficient for automatic monitoring `[1;8]`
  • Logging will slow down performance `[4;7]`
  • Log messages format will not help in issues discovery `[2;8]`

Additional Information

  • No labels

7 Comments

  1. As was discussed with Makoto Takemiyawe can adopt some existing library for this functionality.

  2. Hi,

    1. Logs should be printed to stdout and stderr.
    2. Log-level (verbosity) should be changeable in runtime, example: system signal (e.g., SIGUSR1, SIGHUP) to reread configs(env variable)  and change the log-level.
    3. rfc3164  is  already obsolete,  you can use https://tools.ietf.org/html/rfc5424 or newer.  Nowadays de facto standard for log is JSON not syslog.
    4. I recommend to use any already existed library for logging. 
    5. The formatting is fine, until it is the same for all logs. 
    6. Does log always one line or you may have multi-line logs (e.g stacktrace)?
    1. Thanks Bulat Saifullin, logs can contain multi-line messages for errors, all other messages are formated as one line.

      1. Nikita Puzankov , can you show here the example of such log?
        also multiline log should be easily distinguished, one solution is to start every log with number e.g date, every new line that started not from the number is the same event:
        example:

        DATE_TIME LEVEL function_name[start]: msg with stacktrace
         stacktrace
         stacktrace
         stacktrace
        


        1. These logs possible only if application crashes and was launched with `RUST_BACKTRACE=1` environment variable, so I think we have no easy way to work with it's format:

             0: backtrace::backtrace::libunwind::trace
                       at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/libunwind.rs:86
             1: backtrace::backtrace::trace_unsynchronized
                       at /cargo/registry/src/github.com-1ecc6299db9ec823/backtrace-0.3.46/src/backtrace/mod.rs:66
             2: std::sys_common::backtrace::_print_fmt
                       at src/libstd/sys_common/backtrace.rs:78
             3: <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt
                       at src/libstd/sys_common/backtrace.rs:59
          test torii::tests::create_and_start_torii ... ok
             4: core::fmt::write
                       at src/libcore/fmt/mod.rs:1076
             5: std::io::Write::write_fmt
                       at src/libstd/io/mod.rs:1537
             6: std::sys_common::backtrace::_print
                       at src/libstd/sys_common/backtrace.rs:62
             7: std::sys_common::backtrace::print
                       at src/libstd/sys_common/backtrace.rs:49
             8: std::panicking::default_hook::{{closure}}
                       at src/libstd/panicking.rs:198
             9: std::panicking::default_hook
                       at src/libstd/panicking.rs:218
            10: std::panicking::rust_panic_with_hook
                       at src/libstd/panicking.rs:486
            11: rust_begin_unwind
                       at src/libstd/panicking.rs:388
            12: core::panicking::panic_fmt
                       at src/libcore/panicking.rs:101
            13: core::option::expect_none_failed
                       at src/libcore/option.rs:1272
            14: core::result::Result<T,E>::expect
                       at /home/humb1t/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/result.rs:963
            15: iroha::sumeragi::tests::not_enough_peers
                       at iroha/src/sumeragi.rs:1287
            16: iroha::sumeragi::tests::not_enough_peers::{{closure}}
                       at iroha/src/sumeragi.rs:1280
            17: core::ops::function::FnOnce::call_once
                       at /home/humb1t/.rustup/toolchains/stable-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/src/libcore/ops/function.rs:232
            18: <alloc::boxed::Box<F> as core::ops::function::FnOnce<A>>::call_once
                       at /rustc/c367798cfd3817ca6ae908ce675d1d99242af148/src/liballoc/boxed.rs:1076
            19: <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
                       at /rustc/c367798cfd3817ca6ae908ce675d1d99242af148/src/libstd/panic.rs:318
            20: std::panicking::try::do_call
                       at /rustc/c367798cfd3817ca6ae908ce675d1d99242af148/src/libstd/panicking.rs:297
            21: std::panicking::try
                       at /rustc/c367798cfd3817ca6ae908ce675d1d99242af148/src/libstd/panicking.rs:274
            22: std::panic::catch_unwind
                       at /rustc/c367798cfd3817ca6ae908ce675d1d99242af148/src/libstd/panic.rs:394
            23: test::run_test_in_process
                       at src/libtest/lib.rs:541
            24: test::run_test::run_test_inner::{{closure}}
                       at src/libtest/lib.rs:450
          note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
          
          


          We do not produce such kind of logs from our application in usual cases (even when Internal Error occurs).

          Is it okay to do not touch this case?

          1. Yes, it's okay. let's just change the format, as I proposed, put the date at the beginning of the log, as you can see all strings in you example start from word (`test`, `note:` ) or space (`  0:`, `             at `), it will be easier to parse logs if real logs start from date.

            DATE_TIME LEVEL function_name[start]: arguments
            DATE_TIME LEVEL function_name[end]: result