Skip to content

Use reports for logging errors#768

Open
martosaur wants to merge 1 commit into
elixir-ecto:masterfrom
martosaur:am-report-logging
Open

Use reports for logging errors#768
martosaur wants to merge 1 commit into
elixir-ecto:masterfrom
martosaur:am-report-logging

Conversation

@martosaur
Copy link
Copy Markdown

This PR proposes to log error using report format instead of simple string. The change should be transparent, except for those who potentially relied on log message structure.

More Context

One of the more common experiences when working with Ecto in production is whenever there is a problem with DB, there is a influx of errors that all have the same root cause, but different stacktraces and, more importantly, different message. For example:

Postgrex.Protocol (#PID<0.2021.0> ("db_conn_4")) failed to connect: ** (Postgrex.Error) FATAL 53300 (too_many_connections) remaining connection slots are reserved for roles with the SUPERUSER attribute

This one is coming from a simple Logger.error call which interpolates pid, connection name and formatted underlying error. This provides a good UX in dev, but isn't great for structured logging solutions and error tracking systems used in production, as it makes it harder to group errors.

One way to address this is to use reports for logging, which allows for passing structured data to logger handlers while still defining a preferred way to format report. Logging libraries then can choose what to do with a report.

I don't think report logging is currently common in Elixir, so for me the feedback for this PR is important for this direction as a whole. If it's officially blessed, then it might be a good candidate for inclusion into library guidelines?

Copy link
Copy Markdown
Member

@josevalim josevalim left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR. Don't we already have app and module/function in the metadata? So do we need to add it to the message too?

@martosaur
Copy link
Copy Markdown
Author

Ah yes, that's fair. Although I don't think it's added if Logger.bare_log is used?

What I think I was trying to go for here is to mimic OTP's label key, which is not exactly a module-function tuple: label: {:gen_server, :terminate}. But label is sort of reserved key, as translator drops all events with it that are of unknown shape.

I think ideally we'd want to have some convention for log identifiers. For example:

  def handshake_shutdown(timeout, pid, sock) do
    if Process.alive?(pid) do
      Logger.error(
        %{
          log_id: {Postgrex, :handshake_timeout},
          pid: pid,
          timeout: timeout
        },
        report_cb: fn %{source: {module, _}} = report ->
          {"~ts (~tw) timed out because it was handshaking for longer than ~twms",
           [inspect(module), report.pid, report.timeout]}
        end
      )

      :gen_tcp.shutdown(sock, :read_write)
    end

With this, users (including other libraries authors) can identify log message by id that is theoretically more stable than module-function pair. Not sure about the name tho. log_id or lid seems ok?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants