libnvme: propagate the error to the caller if nvme_get_log_page() fails#1049
libnvme: propagate the error to the caller if nvme_get_log_page() fails#1049igaw merged 1 commit intolinux-nvme:masterfrom
Conversation
8d83112 to
a755dbe
Compare
|
This is v2 material and a WIP for the API change is here: https://github.com/nvme-experiments/libnvme/tree/libnvme2
|
|
@igaw while I agree that in general this is addressed by V2, I think that this particular change should be considered for libnvme 1.x as a temporary fix for the "failed to get discovery log: Success" error message that our customers are hitting now. I am also ok with avoiding raising the log level to NVME_ERROR if you want to make the change even less intrusive. |
|
Ah sorry, I only skimmed over the patch. Let me look closer now. |
|
|
||
| out_free_log: | ||
| free(log); | ||
| if (!errno) |
There was a problem hiding this comment.
IIRC nvme_get_log_page wont set errno, when the return value is positive. So in theory errno could be set... Leave it as it is, just saying it is not 100% safe. Did I mention that errno is stupid?
There was a problem hiding this comment.
Not sure what do you mean here. Yes, in some cases nvme_get_log_page() doesn't set errno... and this is precisely the problem. The caller sees that nvme_discovery_log() returns a NULL pointer and assumes that errno is set.
This is why the "failed to get discovery log: Success" error message appears
a755dbe to
0a15904
Compare
The nvme_get_log_page() function may return an NVMe status code on failure, leaving the errno number set to zero. The nvme_discovery_log() function wasn't propagating this error information to its caller; it would just return NULL without setting a meaningful errno value. the NULL pointer is interpreted as an error by nvme-cli, it then tries to print the string associated to the errno value, resulting in a funny output: failed to get discovery log: Success Fix this by capturing the status code returned by nvme_get_log_page(), converting it to an errno value using nvme_status_to_errno(), and setting errno before returning. Additionally, the error message has been improved to include the NVMe status code and the log level has been raised from LOG_INFO to LOG_ERR. Signed-off-by: Maurizio Lombardi <[email protected]>
0a15904 to
ec22339
Compare
|
Thanks! |
The nvme_get_log_page() function may return an NVMe status code on failure, leaving the errno number set to zero.
The nvme_discovery_log() function wasn't
propagating this error information to its caller; it would just return NULL without setting a meaningful errno value.
the NULL pointer is interpreted as an error by nvme-cli, it then tries to print the string associated to the errno value, resulting in a funny output:
failed to get discovery log: SuccessFix this by capturing the status code returned by
nvme_get_log_page(), converting it to an errno value using nvme_status_to_errno(), and setting errno before returning.
Additionally, the error message has been improved to include the NVMe status code and the log level has been raised from LOG_INFO to LOG_ERR.
@igaw Daniel, do you have any plan to fix the error handling in libnvme/nvme-cli ? I remember we had a small chat about this some time ago, maybe at ALPSS, I don't remember.
I think this needs to be fixed, even if it results in API breakages.
In general, I think that the usage of errno should be restricted to only those functions that directly call ioctl(); IMO libnvme users like nvme-cli should never use errno.
I also think that prototypes like this:
struct nvmf_discovery_log *nvmf_get_discovery_wargs(struct nvme_get_discovery_args *args)were a mistake.
I think it would have been better if it was like this:
int nvmf_get_discovery_wargs(struct nvme_get_discovery_args *args, struct nvme_discovery_log **log)So it could return an integer < 0 for errors like ENOMEM, EIO etc...
and integers > 0 for nvme commands status code.
0 for success with *log pointing to the allocated discovery log.
It would have been much easier to propagate the error code to its callers