[1.15] Fix scrolling timeout issue#678
Conversation
This change addresses a potential issue where the maximum scrolling distance may change between calculating it, sending the scroll message, and verifying the new position. If verifying the position times out, the code now checks if the scrolling distance has changed. If it has, scrolling is retried once. Additionally, the maximum wait time for verifying the new position has been reduced from 30 seconds to 3 seconds.
|
By looking at your code, you are just trying to scroll again one more time if you encounter a timeout if the mouse has moved on the screen? So there is nothing preventing the issue from happening a second time, it would just be less likely to happen. |
|
@enricodias In my case, due to the overlay that opens, the maximum distance always ends up being 0. In this situation, scrolling is not repeated. If something is rendered that only reduces the maximum distance but doesn’t set it to 0, an attempt to scroll again will be made. Here is a simplified example of the code I use to load a page and scroll to the bottom: <?php
use HeadlessChromium\BrowserFactory;
use HeadlessChromium\Exception\CommunicationException;
use HeadlessChromium\Exception\CommunicationException\ResponseHasError;
use HeadlessChromium\Exception\NoResponseAvailable;
use HeadlessChromium\Exception\OperationTimedOut;
use HeadlessChromium\Page;
include __DIR__ . '/vendor/autoload.php';
class ScrollDownPage
{
/**
* @throws Exception
*/
public function invoke(Page $page): ?string
{
$distance = $this->waitAndGetMaxScrollingDistance($page);
if ($distance === 0) { // Retry once
$distance = $this->waitAndGetMaxScrollingDistance($page);
}
if ($distance > 0) {
$scrollingEvents = 0;
while ($distance > 0 && $scrollingEvents < 1_000) {
$page->mouse()->scrollDown($distance);
$distance = $this->waitAndGetMaxScrollingDistance($page);
$scrollingEvents++;
}
} else {
throw new Exception('Scrolling down failed. Couldn’t scroll down even once.');
}
return $page->getHtml();
}
/**
* @throws OperationTimedOut
* @throws CommunicationException
* @throws NoResponseAvailable
* @throws ResponseHasError
*/
private function waitAndGetMaxScrollingDistance(Page $page): int
{
$distance = $this->getMaxYScrollingDistance($page);
if ($distance > 0) {
return $distance;
}
for ($i = 1; ($i * 50_000) <= 1_000_000; $i++) {
usleep(50_000);
$distance = $this->getMaxYScrollingDistance($page);
if ($distance > 0) {
return $distance;
}
}
return 0;
}
/**
* @throws CommunicationException
* @throws CommunicationException\ResponseHasError
* @throws NoResponseAvailable
* @throws OperationTimedOut
*/
private function getMaxYScrollingDistance(Page $page): int
{
$scrollableArea = $page->getLayoutMetrics()->getCssContentSize();
$visibleArea = $page->getLayoutMetrics()->getCssVisualViewport();
$maximumY = $scrollableArea['height'] - $visibleArea['clientHeight'];
return (int) $maximumY - (int) $visibleArea['pageY'];
}
}
$url = 'https://www.example.com/something';
$browserFactory = new BrowserFactory('chromium');
$browser = $browserFactory->createBrowser(['windowSize' => [1920, 1000], 'headless' => false]);
$page = $browser->createPage();
$page->navigate($url)->waitForNavigation();
// Scroll down
$html = (new ScrollDownPage())->invoke($page);If you’d like to try it yourself, you might want to modify the private function scroll(int $distanceY, int $distanceX = 0): self
{
$this->page->assertNotClosed();
// make sure the mouse is on the screen
$this->move($this->x, $this->y);
[$distances, $targets] = $this->getScrollDistancesAndTargets($distanceY, $distanceX);
error_log('try to scroll distances: ' . var_export($distances, true));
// scroll
$this->sendScrollMessage($distances);
try {
// wait until the scroll is done
Utils::tryWithTimeout(3_000_000, $this->waitForScroll($targets['x'], $targets['y']));
} catch (\HeadlessChromium\Exception\OperationTimedOut $exception) {
error_log('failed to verify scroll distances');
// Maybe the possible max scroll distances changed in the meantime.
$prevDistances = $distances;
[$distances, $targets] = $this->getScrollDistancesAndTargets($distanceY, $distanceX);
error_log('new distances: ' . var_export($distances, true));
if ($prevDistances === $distances) {
throw $exception;
}
if (0 !== $distanceY || 0 !== $distanceX) { // Try with the new values.
$this->sendScrollMessage($distances);
// wait until the scroll is done
Utils::tryWithTimeout(3_000_000, $this->waitForScroll($targets['x'], $targets['y']));
}
}
// set new position after move
$this->x += $distances['x'];
$this->y += $distances['y'];
return $this;
}I thought that since the maximum possible scroll distance is already determined and replaces the user-requested distance before the message is sent to the browser, it wouldn’t be desirable for an exception to be thrown if it can't scroll the full distance (or no distance at all). Additionally, I found it a bit excessive to try verifying the position for 30 seconds, especially since it might often fail due to overlays - which, unfortunately, are everywhere on the internet nowadays. |
|
This issue has been automatically marked as stale because there has been no recent activity. It will be closed after 30 days if no further activity occurs. Thank you for your contributions. |
|
I'd still like to see this implemented. Can I have your opinion about my previous comment @enricodias or @GrahamCampbell ? |
|
This issue has been automatically marked as stale because there has been no recent activity. It will be closed after 30 days if no further activity occurs. Thank you for your contributions. |
|
We've been having scroll timeouts every now and then as well. @GrahamCampbell Any reason this hasn't been merged nor closed? |
|
The reason why this is delayed is because it's tough to review. I have to both be able to understand it and validate not only that it solves the problem, but that it doesn't break something else. |
That's fair enough. For now our issue has indeed been quite inconsistent so I cannot say for certain this will fix it either. I'll add more logging to our codebase to maybe get some more information, though I doubt I'll be able to get an MRE out of it. |
|
Thanks @KSneijders for chiming in here, and for offering to dig into your logs for more info on your case - much appreciated. I took some time to make the issue reproducible and found several distinct ways to trigger it: the scrollable area shrinking, an overlay locking scrolling, and a lazy-loaded image shifting the layout mid-scroll (the infinite-scroll case). Each is now a deterministic test fixture. Turns out the fix here doesn't cover all of them. It recovers the shrink and lock cases, but not the layout-shift one, where the page grows and the recomputed distance stays the same, so it still times out. I opened #727 with a more robust approach instead: rather than waiting for an exact target, it watches the scroll position and stops once it settles. Handles all three cases, with regression tests and before/after videos. Note on transparency: I used AI assistance, but this is not blindly generated code - I reviewed, verified and polished every line myself. |
This change addresses an issue where the maximum scrolling distance may change between calculating it, sending the scroll message, and verifying the new position. If verifying the position times out, the code now checks if the scrolling distance has changed. If it has, scrolling is retried once.
I encountered this issue on pages where an overlay pops up. The problem occurs randomly, approximately once every 10 executions of the same code. I suspect it happens when the overlay is rendered precisely between calculating the maximum possible scroll distance and verifying it after sending the scroll message.
This change appears to have completely resolved the issue in my specific case. However, I haven’t added a test case, as I don't know how to reliably reproduce this behavior.
I also reduced the wait time for verifying the new position, as 30 seconds seemed unnecessarily long.
I created this pull request directly without opening an issue first. If you’d prefer that I create an issue as well, please let me know.