Google Lighthouse is one of the most effective ways to gamify and promote web page performance among developers. Using Lighthouse, we can score a web page on its overall performance, accessibility, SEO, and Google’s preferred “best practices” – all at the press of a button.
These tests can be used to assess the out-of-the-box performance of front-end frameworks, or to demonstrate the performance gains achieved through a meticulous refactor. And showing off your perfect Lighthouse scores on social media is undeniably satisfying – a badge of honor worthy of a colorful celebration.
The very fact that Lighthouse gets developers like us talking about performance is a win. That said, and without dismissing the importance of the topic, web performance is considerably more nuanced than a single score. In this article, we’ll look at how Google Lighthouse calculates its performance scores and, armed with that knowledge, attempt to game those scores to our advantage – all in the name of fun and curiosity. Because, in the end, Lighthouse is a useful but rough guide for diagnosing performance issues. Let’s have some fun with it and see how far we can trick Lighthouse into reporting better scores than we deserve.
But first, let’s talk about data.
Field data is essential
Local performance tests are a useful signal of whether your website’s performance is trending in the right direction, but they don’t paint a full picture of reality. The web is wildly diverse: we’ve long since lost count of the combinations of devices, internet connection speeds, screen sizes, browsers, and browser versions people use to access websites – all of which affect page performance and user experience.
Field data – collected at scale by an Application Performance Monitoring tool like Sentry from real users visiting your website on their own devices – gives a far more accurate assessment of your website’s performance than lab data collected from a single test on a high-spec developer machine under controlled conditions. HTTP Archive data surfaced by Philip Walton in 2021 showed that “nearly half of all pages that scored 100 on Lighthouse didn’t meet the recommended Core Web Vitals thresholds.”
Web performance is more than a single core web vital metric or a Lighthouse performance score. What we’re talking about goes way beyond the type of data we’re working with.
Web performance is more than numbers
Speed is often the first thing that comes up in conversations about web performance – how fast a page loads. While that’s an important consideration, speed talk is largely driven by business goals and sales targets. Google reported in 2018 that the probability of a user bouncing increases by 32% as page load time goes from one second to three seconds, and by 123% as it goes from one second to ten seconds. So the guidance follows: make pages load faster, reduce bounce rates, and convert more sales.
But what does “load faster” really mean? There’s a point at which a web page physically cannot load any faster. Humans – and the servers that connect them – are spread across the globe, and modern internet infrastructure can only deliver so much data at a time.
The crux of the matter is that a page load is not a single moment in time. In an article titled “What is speed?”, Google explains that a page load is:
an experience that no single metric can fully capture. There are multiple moments during the load experience that can affect whether a user perceives it as “fast,” and if you just focus on one, you might miss bad experiences that happen during the rest of the time.
The key word here is experience. Real web performance is less about metrics and speed, and more about how a user experiences a page as it loads and becomes operable. This leads nicely into a discussion of how Google Lighthouse calculates its performance score. (It’s much less about pure speed than you might think.)
How is the Google Lighthouse performance score determined?
The Google Lighthouse performance score is calculated from a weighted combination of scores based on core web vital metrics (i.e., First Contentful Paint (FCP), Largest Contentful Paint (LCP), Cumulative Layout Shift (CLS)) and other speed-related metrics (i.e., Speed Index (SI) and Total Blocking Time (TBT)) that are observable throughout the page load timeline.
Below is the breakdown of how these metrics are weighted within the overall score:
| Metric | Weighting (%) |
| --- | --- |
| Total Blocking Time (TBT) | 30 |
| Cumulative Layout Shift (CLS) | 25 |
| Largest Contentful Paint (LCP) | 25 |
| First Contentful Paint (FCP) | 10 |
| Speed Index (SI) | 10 |
How these metrics are weighted offers insight into how Google prioritizes the different elements of a good user experience:
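To make the blend concrete, here’s a minimal JavaScript sketch of the weighted combination, assuming each metric has already been converted into a 0–1 score (how that conversion works is covered later in this article):

```js
// Weights from the table above, expressed as fractions.
const weights = { TBT: 0.3, CLS: 0.25, LCP: 0.25, FCP: 0.1, SI: 0.1 };

// Blend per-metric scores (each 0..1) into the overall 0..100 score.
function overallScore(metricScores) {
  let total = 0;
  for (const [metric, weight] of Object.entries(weights)) {
    total += weight * metricScores[metric];
  }
  return Math.round(total * 100);
}

// Example: perfect on everything except a heavily blocked main thread.
console.log(overallScore({ TBT: 0.4, CLS: 1, LCP: 1, FCP: 1, SI: 1 })); // 82
```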
1. A web page should respond promptly to user input
The most heavily weighted metric is Total Blocking Time (TBT), which looks at the time after First Contentful Paint (FCP) and flags periods where the main thread was blocked long enough to prevent speedy responses to user input. The main thread is considered “blocked” any time a JavaScript task runs on it for more than 50ms; TBT sums the portion of each task beyond that 50ms threshold. Minimizing TBT helps ensure a page responds to physical user input (e.g., key presses, mouse clicks, etc.).
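One common way to bring TBT down is to break one long task into smaller chunks that periodically yield back to the main thread. A minimal sketch, where `expensiveWork` is a hypothetical stand-in for real per-item computation:

```js
// Hypothetical per-item work standing in for real computation.
function expensiveWork(item) {
  JSON.stringify(item);
}

// Split one long task into chunks so no single run of work holds the
// main thread past the ~50ms long-task threshold that TBT counts.
async function processItems(items) {
  let deadline = performance.now() + 40; // headroom under 50ms
  for (const item of items) {
    expensiveWork(item);
    if (performance.now() > deadline) {
      // Yield back to the main thread so pending input events can run.
      await new Promise((resolve) => setTimeout(resolve, 0));
      deadline = performance.now() + 40;
    }
  }
}

processItems(Array.from({ length: 10000 }, (_, i) => ({ i })));
```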
2. A web page should show useful content without unexpected visual shifts
The next most heavily weighted Lighthouse metrics are Largest Contentful Paint (LCP) and Cumulative Layout Shift (CLS). LCP marks the point in the page load timeline when the page’s main content has likely loaded and the page is therefore useful.
After the main content has likely loaded, the page should maintain visual stability so that users can engage with it seamlessly and aren’t thrown off by sudden visual shifts (CLS). A good LCP is any value under 2.5 seconds – far more generous than we might expect, given that we’re forever optimizing our websites to be as fast as possible.
3. A web page should show content quickly
First Contentful Paint (FCP) marks the first point in the page load timeline where the user can see anything at all on the screen, while Speed Index (SI) measures how quickly content is visually displayed during page load until the page is “complete.”
Your page is scored against the speed of real websites, using performance data from the HTTP Archive. A good FCP is under 1.8 seconds, and a good SI is under 3.4 seconds. Both thresholds are more generous than you might expect when you think about speed.
Usability is favored over raw speed
Google Lighthouse’s performance scoring is, without a doubt, biased toward usability over raw speed. Your SI and FCP could be blazingly fast, but if your LCP renders too slowly, or if images or third-party content cause layout shifts (CLS) as they load, your overall performance score will be lower than if your page were slightly slower to render the FCP but caused no CLS. Likewise, if JavaScript blocks the main thread for more than 50ms and makes the page unresponsive, the score takes a bigger hit than a slightly delayed FCP would cause.
To get a feel for how the weighting of each metric affects the final performance score, play with the sliders on the Lighthouse Scoring Calculator. Here’s a simplified table showing how skewing a single metric affects the overall score, demonstrating that page usability and responsiveness are prioritized over raw speed.
| Description | FCP (ms) | SI (ms) | LCP (ms) | TBT (ms) | CLS | LH perf score |
| --- | --- | --- | --- | --- | --- | --- |
| Delay in displaying content on the screen | 6000 | 0 | 0 | 0 | 0 | 90 |
| Delay in loading content over time | 0 | 5000 | 0 | 0 | 0 | 90 |
| Delay in loading the primary segment of the page | 0 | 0 | 6000 | 0 | 0 | 76 |
| Visual alterations during page loading | 0 | 0 | 0 | 0 | 0.82 | 76 |
| Page’s unresponsiveness to user input | 0 | 0 | 0 | 2000 | 0 | 70 |
The overall Google Lighthouse performance score is calculated by converting each raw metric value into a score from 0 to 100 according to where it falls on the Lighthouse scoring distribution, a log-normal distribution derived from the performance metrics of real websites in the HTTP Archive. There are two key takeaways from all this math:
- Your Lighthouse performance score is plotted against real website performance data, not scored in isolation.
- Because the scoring uses a log-normal distribution, the relationship between individual metric values and the overall score is non-linear: you can make big gains on low-scoring metrics, but improving an already high score gets progressively harder.
Read more about how metric scores are determined, including a visual of the log-normal distribution curve, on developer.chrome.com.
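To make that non-linearity concrete, here’s a minimal JavaScript sketch of log-normal scoring, assuming (as Lighthouse roughly does) that each metric is scored against two control points: a median value that scores 0.5 and a p10 value that scores 0.9. The control points below are illustrative approximations of a mobile LCP curve, not authoritative Lighthouse constants:

```js
// Complementary error function via the Abramowitz & Stegun 7.1.26
// approximation (accurate to ~1.5e-7), used to build a normal CDF.
function erfc(x) {
  const t = 1 / (1 + 0.3275911 * Math.abs(x));
  const y =
    t *
    (0.254829592 +
      t * (-0.284496736 + t * (1.421413741 + t * (-1.453152027 + t * 1.061405429)))) *
    Math.exp(-x * x);
  return x >= 0 ? y : 2 - y;
}

// Score a raw metric value on a log-normal curve where `median`
// scores 0.5 and `p10` scores 0.9.
function logNormalScore(value, median, p10) {
  const stdNormalCdf = (z) => 0.5 * erfc(-z / Math.SQRT2);
  // Pick sigma so that `p10` lands at the 10th percentile (z = -1.28155).
  const sigma = (Math.log(p10) - Math.log(median)) / -1.28155;
  const z = (Math.log(value) - Math.log(median)) / sigma;
  return 1 - stdNormalCdf(z); // 0..1
}

// Illustrative control points loosely based on mobile LCP thresholds.
console.log(logNormalScore(2500, 4000, 2500).toFixed(2)); // ~0.90
console.log(logNormalScore(4000, 4000, 2500).toFixed(2)); // ~0.50
console.log(logNormalScore(8000, 4000, 2500).toFixed(2)); // ~0.03
```

Note how doubling an already-poor value barely moves the score, while small improvements near the p10 point pay off quickly.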
Is It Possible to Outsmart Google Lighthouse?
I admire Google’s emphasis on usability over raw speed in the web performance conversation. It urges developers to think about delivering genuine experiences rather than chasing arbitrary numbers. That said, I’ve wondered whether, here in 2024, it’s possible to fool Google Lighthouse into thinking a badly built, useless page is great.
I donned my lab coat and science goggles to investigate. All tests were conducted:

- using the Chromium Lighthouse plugin,
- in an incognito window in the Arc browser,
- with the “navigation” and “mobile” settings (except where noted otherwise),
- by me, in a lab (i.e., no field data).
Even though the controlled environment of my tests contradicts the advice at the top of this post, the experiment proved a fascinating exercise. What I hope you take away is that Lighthouse scores are only one piece – and a small one at that – of a much bigger web performance puzzle, and that without field data, we can’t be sure how much these findings really mean.
Strategies to Manipulate FCP and LCP Scores
TL;DR: On page load, show the smallest amount of LCP-qualifying content possible, and delay everything else until the Lighthouse test has likely finished, to inflate the FCP and LCP scores.
FCP marks the first point in the page load timeline where the user can see anything at all on the screen, while LCP marks the point in the timeline when the main page content (i.e., the largest text or image element) has likely loaded. A fast LCP helps reassure the user that the page is useful. “Likely” and “useful” are the words to keep in mind here.
Identifying LCP Elements
The types of elements on a web page that Lighthouse considers for LCP are:

- `<img>` tags
- `<image>` tags inside an `<svg>` tag
- `<video>` tags
- An element with a background image loaded via the `url()` function (as opposed to CSS gradients)
- Block-level elements containing text nodes or other inline-level text elements
The following elements are excluded from LCP consideration because they’re presumed not to contain useful content:
- Elements with an opacity of 0 (invisible to the user),
- Elements that cover the full viewport (likely to be background elements), and
- Placeholder images or other images with low entropy (i.e., low informational content, such as a solid-colored image).
However, judging whether an image or text element is useful is highly subjective, and generally beyond what a machine algorithm can reliably determine. For example, I built a page containing nothing but an `<h1>` tag; after 10 seconds, JavaScript inserts more descriptive content into the DOM and hides the `<h1>` tag.
In this experiment, Lighthouse considers the heading element to be the largest contentful paint (LCP) element. At that point, the page load timeline has ended, even though the main content hasn’t actually loaded – Lighthouse simply deems it “likely” to have loaded. Remarkably, Lighthouse still awards a perfect score of 100 even when the heading is replaced with something even less useful, such as a single period.
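A minimal sketch of what such a page might look like (the markup, copy, and delay are illustrative, not the demo’s actual source):

```html
<!-- A lone element acts as the LCP bait; the real content only
     arrives 10 seconds later, long after Lighthouse has moved on. -->
<h1 id="lcp-bait">.</h1>
<main id="real-content"></main>
<script>
  setTimeout(() => {
    document.getElementById('real-content').innerHTML =
      '<p>The content you actually came here for.</p>';
    document.getElementById('lcp-bait').style.display = 'none';
  }, 10000);
</script>
```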
This test suggests that, when loading page content via client-side JavaScript, it may actually be better not to show a skeleton loader screen, since doing so requires rendering additional elements on the page. Instead, by acknowledging that the content will take some time – and by moving the network request off the main thread to a web worker so it doesn’t affect the total blocking time – you can show a generic “splash screen” containing a minimum viable LCP element for a quick FCP. Lighthouse is then led to believe the page became useful far sooner than it actually did.
All you need is a valid LCP element that doubles as the FCP. While I wouldn’t recommend loading your main page content via client-side JavaScript in 2024 (serve static HTML from a CDN, or build as much of the page on the server as possible), I also wouldn’t recommend this “hack” for a good user experience, regardless of the Lighthouse performance score. It’s no good for search engine indexing either, since bots can’t find your main content when it’s absent from the DOM.
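Here’s a minimal sketch of the splash-screen variant. The endpoint, worker file name, and copy are hypothetical:

```html
<!-- The splash <h1> satisfies FCP/LCP while a web worker fetches the
     real content off the main thread, leaving TBT untouched. -->
<h1 id="splash">Preparing your experience…</h1>
<main id="content" hidden></main>
<script>
  const worker = new Worker('fetch-worker.js');
  worker.postMessage('/api/page-content'); // hypothetical endpoint
  worker.onmessage = (event) => {
    document.getElementById('content').innerHTML = event.data;
    document.getElementById('content').hidden = false;
    document.getElementById('splash').hidden = true;
  };
</script>
```

With a tiny worker script alongside it:

```js
// fetch-worker.js — fetches the content away from the main thread.
onmessage = async (event) => {
  const response = await fetch(event.data);
  postMessage(await response.text());
};
```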
I ran a similar experiment using various random images as the LCP to make the page even less useful. But because I was using smaller file sizes – shrunk further and converted to “next-gen” image formats via a third-party image API to speed up page load – Lighthouse classified the images as either “placeholder images” or images with “low entropy,” and they weren’t recognized as LCP elements at all. A welcome outcome, as it makes the LCP harder to game.
Check out the demo page in Chromium DevTools in incognito mode to see the results for yourself.
That said, this strategy won’t hold up in many other scenarios. Discord, for instance, uses the “splash screen” approach on a hard refresh of its app in the browser, and it scores a dismal 29 for performance.
Unlike my DOM-injected demo, Discord’s LCP element was identified as content hidden behind the splash screen rather than anything in the splash screen itself – likely because of one or more large images in the text channel I tested. You could argue that Lighthouse scores matter less for apps that sit behind authentication anyway: they don’t need to be indexed by search engines.
In many other scenarios, though, apps serve user-generated content, and you may struggle to fully control the LCP element, especially where images are concerned.
If you do have control over the sizes of all the images on your web pages, you might be able to exploit an interesting hack, or “optimization” (in very deliberate quotation marks), to game the system. RentPath demonstrated this in 2021: its developers lifted their Lighthouse performance score by 17 points by enlarging image thumbnails on a web page, tricking Lighthouse into identifying the LCP element as one of the larger thumbnails rather than a Google Maps tile on the page, which loaded far more slowly via JavaScript.
The takeaway is that you can achieve higher Lighthouse performance scores if you’re aware of your LCP element and in control of it, whether through a hack like RentPath’s, one of your own devising, or genuine improvements. And while I’ve characterized the splash screen approach as a hack in this article, that doesn’t mean this type of experience can’t be meaningful and enjoyable. Both performance and user experience come down to understanding what’s happening during page load – and the intent behind it.
Strategies for Manipulating CLS Scores
TL;DR: Defer loading content that causes layout shifts until the Lighthouse test has likely finished, so it believes it has all the data it needs. CSS transform animations generally don’t count toward CLS – unless they’re combined with inserting new elements into the DOM.
CLS is measured on a decimal scale: a score below 0.1 is good, and a score above 0.25 is poor. Lighthouse calculates CLS from the largest burst of unexpected layout shifts during a user’s visit, factoring in viewport size and the movement of unstable elements between two rendered frames. Single layout shifts may be insignificant in isolation, but a rapid-fire series of them will hurt your score.
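The same underlying signal is exposed in the browser via the Layout Instability API, so you can watch shifts accumulate yourself. A minimal sketch (the session-window logic that picks the “largest burst” is omitted for brevity):

```js
// Watch layout shifts accumulate using the Layout Instability API,
// the same signal CLS is computed from.
let clsValue = 0;
new PerformanceObserver((entryList) => {
  for (const entry of entryList.getEntries()) {
    // Shifts shortly after user input are expected, so they don't count.
    if (!entry.hadRecentInput) {
      clsValue += entry.value;
      console.log('Layout shift:', entry.value.toFixed(4), 'total:', clsValue.toFixed(4));
    }
  }
}).observe({ type: 'layout-shift', buffered: true });
```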
If you know your page produces janky layout shifts on load, you can defer them until after the page load is complete, and Lighthouse will be none the wiser. One of my demo pages returns a CLS score of 0.143 because JavaScript promptly inserts new text elements onto the page, shifting the original content upward. By pausing the script that adds the new DOM nodes for an arbitrary five seconds with a `setTimeout()`, Lighthouse never sees the CLS happen.
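A minimal sketch of that deferral (the five-second delay and the inserted copy are arbitrary):

```js
// The CLS-deferral hack: wait until the Lighthouse run has likely
// finished before inserting the elements that shift the layout.
setTimeout(() => {
  const extra = document.createElement('p');
  extra.textContent = 'Surprise! Some content you were not expecting.';
  // Prepending pushes existing content down: a layout shift that the
  // completed Lighthouse test never observes.
  document.body.prepend(extra);
}, 5000);
```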
Another demo page earns a perfect performance score of 100, despite arguably being less useful and user-friendly than the previous page, since the extra elements appear abruptly, without any user interaction.
So while it’s possible to hide layout shifts from a page-load test, this workaround does nothing for field data or the long-term user experience, which, as discussed earlier, matter far more. Running a Lighthouse “timespan” test on the deferred-shift page correctly reports an unsatisfactory CLS score of around 0.186.
If you do want to intentionally create a chaotic experience like the demo, you can use CSS animations and transforms to bring content into view more purposefully. Google’s CLS guidance notes that “content that moves gradually and naturally from one position to the next can often help the user better understand what’s going on and guide them between state changes” – once again underscoring the importance of a considered, contextual user experience.
On the next demo page, I use a CSS `scale()` transform to animate the text elements from `0` to `1` and move them around the page. The transforms don’t trigger CLS because the text nodes are already in the DOM when the document loads. That said, I noticed in my experimentation that if the text elements are dynamically added to the DOM after page load via JavaScript and then animated, Lighthouse does detect CLS and scores accordingly.
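A minimal sketch of the transform approach, assuming the animated elements are present in the markup from the start:

```html
<!-- The element exists in the DOM from the start and is animated in
     with transform, which does not affect layout and so causes no CLS. -->
<style>
  .reveal {
    transform: scale(0);
    animation: pop-in 0.3s ease-out forwards;
    animation-delay: 1s;
  }
  @keyframes pop-in {
    to {
      transform: scale(1);
    }
  }
</style>
<p class="reveal">I pop into view without shifting any layout.</p>
```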
You cannot manipulate a Speed Index score
The Speed Index score is based on the visual progress of the page as it renders: the more content that loads early in the page load timeline, the better the score.
You could employ tricks that make the Speed Index think a page load was slower than it was, but there’s no way to artificially speed up how content appears. The only way to improve your Speed Index score is to optimize your page to load as much content as possible, as quickly as possible. While taking this to the extreme isn’t exactly practical on the modern web (mainly because it would put designers out of a job), you could go all-in on boosting your Speed Index by:
- Serving static HTML web pages directly from a CDN,
- Omitting images from the page,
- Reducing or removing CSS, and
- Loading no JavaScript or third-party dependencies at all.
You also can’t truly manipulate a TBT score
TBT measures the total time after the FCP during which the main thread was blocked by JavaScript tasks long enough to prevent responses to user input. A good TBT is under 200ms.
JavaScript-heavy web applications (such as single-page applications) that perform complex state calculations and DOM manipulation on the client during page load (rather than on the server before sending rendered HTML) are prone to poor TBT scores. In that case, you could, in theory, game your TBT score by deferring all JavaScript until after the Lighthouse test has likely finished. That said, you’d need to provide placeholder content or a loading screen to satisfy FCP and LCP and tell users something is happening, and you’d likely have to fight your front-end framework to pull it off. (Don’t ship a placeholder page that loads a separate React app at some arbitrary point in the page load timeline!)
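A minimal sketch of that dodge, where the bundle path and delay are hypothetical:

```html
<script>
  // Postpone the heavy application bundle until well after the
  // Lighthouse run has likely ended, keeping TBT artificially clean.
  window.addEventListener('load', () => {
    setTimeout(() => {
      const script = document.createElement('script');
      script.src = '/app.bundle.js'; // hypothetical bundle
      document.body.appendChild(script);
    }, 5000);
  });
</script>
```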
Interestingly, while plenty of complex computation still happens on the client, the modern web ecosystem is steadily making poor TBT scores easier to avoid. Many front-end frameworks, paired with modern hosting providers, can render pages and run complex logic on demand without any client-side JavaScript at all. Eliminating client-side JavaScript isn’t the goal, but there are now plenty of ways to ship less of it, reducing the chance of heavy main-thread work during page load.
In summary: Lighthouse is still only a rough guide
Google Lighthouse can’t catch everything wrong with a given website. While its performance scores prioritize page usability in terms of responding to user input, it still can’t detect every usability or accessibility issue in 2024.
In 2019, Manuel Matuzović ran an experiment, deliberately building a terrible page that Lighthouse nevertheless rated highly. I hypothesized that Lighthouse would do better five years on. It did not.
On this final demo page, CSS and JavaScript disable input events, making the page completely unresponsive to user input. After five seconds, JavaScript switches the behavior back and the button becomes interactive. The page still scores a perfect 100 for both performance and accessibility.
You simply can’t rely on Lighthouse as a substitute for usability testing and common sense.
Some more fun hacks
As with everything in life, there’s always a way to game the system. Here are some tried, tested, and guaranteed ways to artificially boost your Lighthouse performance scores beyond everyone else’s:
- Only run Lighthouse tests on fast, high-spec hardware.
- Make sure your internet connection is excellent; relocate if necessary.
- Only use lab data – collected on said fast hardware and excellent internet connection – and never field data from real users.
- Rerun the tests in the lab, using all the ingenious hacks detailed in this article, until you get the result you want to impress your friends, colleagues, and internet strangers.
Most importantly: to truly understand web performance and level up your website optimization, use an application monitoring tool like Sentry. Think of Lighthouse as the early-warning signal, and Sentry as the production-data-capturing, lean, mean Web Vitals machine.
Finally, here’s the link to the full demo website, for educational purposes.