
Commit 615b690

kliteyn authored and kuba-moo committed
net/mlx5: HWS, fix simple rules rehash error flow
Moving rules from matcher to matcher should not fail. However, if it does fail due to various reasons, the error flow should allow the kernel to continue functioning (albeit with broken steering rules) instead of going into a series of soft lock-ups or some other problematic behaviour.

This patch fixes the error flow for moving simple rules:
- If new rule creation fails before it was even enqueued, do not poll for completion.
- If TIMEOUT happened while moving the rule, there is no point trying to poll for completions for other rules. Something is broken, completion won't come, just abort the rehash sequence.
- If some other completion with error is received, don't give up. Continue handling the rest of the rules to minimize the damage.
- Make sure that the first error code that was received is actually returned to the caller instead of being replaced with the generic error code.

All the aforementioned issues stem from the same bad error flow, so there is no point fixing them one by one and leaving partially broken code - fixing them in one patch.

Fixes: ef94799 ("net/mlx5: HWS, rework rehash loop")
Signed-off-by: Yevgeny Kliteynik <[email protected]>
Reviewed-by: Vlad Dogaru <[email protected]>
Signed-off-by: Mark Bloch <[email protected]>
Link: https://patch.msgid.link/[email protected]
Signed-off-by: Jakub Kicinski <[email protected]>
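The error-handling policy listed in the commit message can be illustrated outside the driver with a minimal, self-contained sketch. Everything in it is hypothetical: `struct step_result` and `move_all_sketch()` are invented for illustration and are not part of the mlx5 HWS code; each array entry simply simulates the result of enqueueing one rule move and the result of polling for its completion.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical per-rule outcome used only by this sketch; not a driver type. */
struct step_result {
	int move_ret;	/* result of enqueueing the move, 0 on success */
	int poll_ret;	/* result of polling, ignored when move_ret != 0 */
};

/*
 * Walk the simulated rules using the policy from the commit message.
 * *polled reports how many rules were actually polled.
 */
static int move_all_sketch(const struct step_result *steps, size_t n,
			   size_t *polled)
{
	int move_error = 0, poll_error = 0;
	size_t i;

	*polled = 0;
	for (i = 0; i < n; i++) {
		int ret = steps[i].move_ret;

		if (ret) {
			if (!move_error)
				move_error = ret;	/* keep only the first failure */
			continue;			/* never enqueued: nothing to poll */
		}

		(*polled)++;
		ret = steps[i].poll_ret;
		if (ret == -ETIMEDOUT)
			return ret;			/* completions will not come: abort */
		if (ret && !poll_error)
			poll_error = ret;		/* other errors: record and keep going */
	}

	/* Report a recorded error code rather than a generic one. */
	if (move_error)
		return move_error;
	return poll_error;
}
```

With the simulated sequence "success, move fails with -ENOMEM, poll fails with -EIO, move fails with -EINVAL", this sketch polls only the two rules that were enqueued and returns -ENOMEM. A sequence containing a poll timeout stops at that rule and returns -ETIMEDOUT.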
1 parent 2462c1b commit 615b690

1 file changed

Lines changed: 43 additions & 18 deletions

drivers/net/ethernet/mellanox/mlx5/core/steering/hws/bwc.c

@@ -74,9 +74,9 @@ static void hws_bwc_matcher_init_attr(struct mlx5hws_bwc_matcher *bwc_matcher,
 static int
 hws_bwc_matcher_move_all_simple(struct mlx5hws_bwc_matcher *bwc_matcher)
 {
-	bool move_error = false, poll_error = false, drain_error = false;
 	struct mlx5hws_context *ctx = bwc_matcher->matcher->tbl->ctx;
 	struct mlx5hws_matcher *matcher = bwc_matcher->matcher;
+	int drain_error = 0, move_error = 0, poll_error = 0;
 	u16 bwc_queues = mlx5hws_bwc_queues(ctx);
 	struct mlx5hws_rule_attr rule_attr;
 	struct mlx5hws_bwc_rule *bwc_rule;
@@ -99,23 +99,35 @@ hws_bwc_matcher_move_all_simple(struct mlx5hws_bwc_matcher *bwc_matcher)
 			ret = mlx5hws_matcher_resize_rule_move(matcher,
 							       bwc_rule->rule,
 							       &rule_attr);
-			if (unlikely(ret && !move_error)) {
-				mlx5hws_err(ctx,
-					    "Moving BWC rule: move failed (%d), attempting to move rest of the rules\n",
-					    ret);
-				move_error = true;
+			if (unlikely(ret)) {
+				if (!move_error) {
+					mlx5hws_err(ctx,
+						    "Moving BWC rule: move failed (%d), attempting to move rest of the rules\n",
+						    ret);
+					move_error = ret;
+				}
+				/* Rule wasn't queued, no need to poll */
+				continue;
 			}
 
 			pending_rules++;
 			ret = mlx5hws_bwc_queue_poll(ctx,
 						     rule_attr.queue_id,
 						     &pending_rules,
 						     false);
-			if (unlikely(ret && !poll_error)) {
-				mlx5hws_err(ctx,
-					    "Moving BWC rule: poll failed (%d), attempting to move rest of the rules\n",
-					    ret);
-				poll_error = true;
+			if (unlikely(ret)) {
+				if (ret == -ETIMEDOUT) {
+					mlx5hws_err(ctx,
+						    "Moving BWC rule: timeout polling for completions (%d), aborting rehash\n",
+						    ret);
+					return ret;
+				}
+				if (!poll_error) {
+					mlx5hws_err(ctx,
+						    "Moving BWC rule: polling for completions failed (%d), attempting to move rest of the rules\n",
+						    ret);
+					poll_error = ret;
+				}
 			}
 		}
 
@@ -126,17 +138,30 @@ hws_bwc_matcher_move_all_simple(struct mlx5hws_bwc_matcher *bwc_matcher)
 						     rule_attr.queue_id,
 						     &pending_rules,
 						     true);
-			if (unlikely(ret && !drain_error)) {
-				mlx5hws_err(ctx,
-					    "Moving BWC rule: drain failed (%d), attempting to move rest of the rules\n",
-					    ret);
-				drain_error = true;
+			if (unlikely(ret)) {
+				if (ret == -ETIMEDOUT) {
+					mlx5hws_err(ctx,
+						    "Moving bwc rule: timeout draining completions (%d), aborting rehash\n",
+						    ret);
+					return ret;
+				}
+				if (!drain_error) {
+					mlx5hws_err(ctx,
+						    "Moving bwc rule: drain failed (%d), attempting to move rest of the rules\n",
+						    ret);
+					drain_error = ret;
+				}
 			}
 		}
 	}
 
-	if (move_error || poll_error || drain_error)
-		ret = -EINVAL;
+	/* Return the first error that happened */
+	if (unlikely(move_error))
+		return move_error;
+	if (unlikely(poll_error))
+		return poll_error;
+	if (unlikely(drain_error))
+		return drain_error;
 
 	return ret;
 }
