Bagazhnikpro — Find Product Cards Without a Photo
Purpose
Enumerate every product in the bagazhnikpro.store catalog whose listing card has no product photo (the card renders the grey "Нет Изображения" / no_image_ru.png placeholder instead of a real image) and export the result as an Excel workbook. For each no-photo product the skill returns its internal id, SKU (артикул), title, category, and canonical product URL. Read-only — it only reads public catalog pages; it never logs in, edits, or touches the cart.
When to Use
- A merchandising / content audit: "which products are missing a photo so we can shoot/upload images?"
- Producing a hand-off spreadsheet (
.xlsx) of incomplete product cards for a catalog manager. - Any periodic data-quality check over the full bagazhnikpro.store catalog (~25.8k products across ~50 leaf categories).
- As a template for the sibling shops on the same engine (e.g.
autossport.store), which share the identical "hotengine" CMS markup.
Workflow
bagazhnikpro.store runs the "hotengine" CMS (LiteSpeed/CyberPanel). The optimal method is pure HTTP fetch — no browser, no proxy, no stealth needed (the homepage anti-bot probe reported none detected, and ClaudeBot/GPTBot are explicitly Allowed in robots.txt). The one non-obvious trick: a bare category URL (/ru/pers_shop/<cat>/) returns a JS-only skeleton with zero products to a plain fetch — but appending the pagination query params ?n=<page>&NUM_ONPAGE=<count> makes the server render the full product grid as static HTML. That single insight turns the whole task into deterministic HTML parsing.
-
Collect the leaf category slugs. Fetch the homepage
https://bagazhnikpro.store/ru/(note:https://bagazhnikpro.store/301-redirects to/ru/) and harvest every link matching/ru/pers_shop/<slug>/. ~68 such links exist, but many are parent groupings that render a subcategory list and 0 products (e.g.bagajniki,deflektori,kovriki,lampy,farkopi) — that's fine, they simply yield no rows. The leaf categories (e.g.poperechininareylingi,zashitakarteradvigatelya,organayzery,deflektorinaoknavetroviki) are where products live. -
Page through each category server-side. For each category slug, fetch:
https://bagazhnikpro.store/ru/pers_shop/<slug>/?n=<N>&NUM_ONPAGE=96starting at
N=0and incrementing until a page returns fewer thanNUM_ONPAGEproduct blocks (or zero). Each page is ~430 KB atNUM_ONPAGE=96. Do not crankNUM_ONPAGEarbitrarily high to grab a whole big category in one shot — large categories (e.g.bagajnikinakrishyavtomobilyavsborehas 5,856 products,deflektorinaoknavetroviki3,203,porogi2,637) will blow past the fetch body-size limit. 96/page is a safe sweet spot. -
Parse product cards. Each card is a
<div class="hotengine-shop-product-list-block" ...>carrying:data-hotengine-marking-shop_catalog_page_id="<id>"→ product iddata-hotengine-marking-shop_catalog_page_sku="<sku>"→ SKU / артикул- title in
.hotengine-shop-product-title > h4 > a(also in the anchortitle=attr) - the listing image
<img class="hotengine-shop-products-list-img" ...>
-
Detect "no photo" — the definitive signal. A card with a photo has
<img class="hotengine-shop-products-list-img" src="/upload/shop_catalog/s9060/<id>/<id>_0.small_cachedx300.{webp|jpg}">. A card without a photo has an extra class and a fixed placeholder src:<img class="hotengine-shop-products-list-img hotengine-shop-products-list-img-empty" loading="lazy" src="/img/shop/no_image_ru.png" />So a product is "no photo" iff either its listing
<img>carries the classhotengine-shop-products-list-img-emptyor itssrcis/img/shop/no_image_ru.png(equivalently: thesrcdoes not start with/upload/shop_catalog/). The CSS selectordiv.hotengine-shop-product-list-block img.hotengine-shop-products-list-img-emptymatches exactly the no-photo cards. -
Collect & de-duplicate by product id (a product can appear under more than one category path), decode HTML entities in the title/SKU (
",&, ,«/»), and build the canonical URLhttps://bagazhnikpro.store/ru/pers_shop/<slug>/<id>.htm. -
Write the Excel file. Emit one row per no-photo product with columns
№ | ID товара | Артикул (SKU) | Название | Категория | Ссылкаto an.xlsx(e.g. via thexlsx/SheetJS library, or any spreadsheet writer). Over the full catalog this run produced 43 no-photo products out of 25,817 (≈0.17%).
Browser fallback
If the ?n=&NUM_ONPAGE= server-render ever stops working, drive a browser instead: browse open "<category>?n=0&NUM_ONPAGE=48", browse wait load, then query the DOM directly — document.querySelectorAll("div.hotengine-shop-product-list-block") and flag any whose img.hotengine-shop-products-list-img-empty exists (or whose img.src ends in no_image_ru.png). A bare (non-verified, non-proxied) Browserbase session is sufficient; this site needs no stealth. This path is ~100× slower/costlier than fetch and only needed as a safety net.
Site-Specific Gotchas
- Bare category pages are empty to a fetch.
GET /ru/pers_shop/<slug>/(no query string) returns the page chrome with 0 product blocks — the grid is hydrated client-side. You must add?n=0&NUM_ONPAGE=<n>to get server-rendered products. This is the single thing that traps naive scrapers. /redirects to/ru/. The root issues a301to/ru/. The shop is multilingual (/ru/,/en/,/ua/,/pl/); the placeholder filename is locale-specific — Russian uses/img/shop/no_image_ru.png. On other locales expectno_image_<lang>.png; match on the-emptyclass (locale-agnostic) rather than the exact filename if you crawl non-RU trees.- Two reliable no-photo signals, prefer the class.
img.hotengine-shop-products-list-img-empty(added only to placeholder images) is the most robust signal;src="/img/shop/no_image_ru.png"is the secondary one. Photo cards carry the class without the-emptysuffix and a/upload/shop_catalog/...src. (Caution when regex-parsing: matching the literalclass="hotengine-shop-products-list-img"with a closing quote will miss photo cards' true class only if you assume it's unique — the-emptycards have a second class token after a space.) - Parent vs. leaf categories. Of ~68
/ru/pers_shop/<slug>/links, roughly a third are parent groupings that render a subcategory list and zero products. They're harmless (0 rows) but don't mistake "0 products" for "0 no-photo products" on those — you simply haven't reached the leaves. - De-dupe by id. The same product id can surface in multiple category paths; key your result set on
data-hotengine-marking-shop_catalog_page_idto avoid double-counting. - Watch the body-size cap on big categories.
browse cloud fetch(and the sitemap athttps://autossport.store/sitemap.xml.gzreferenced in robots.txt) will 502 with "response body exceeded the maximum allowed size" on very large responses. KeepNUM_ONPAGE≤ ~96 and paginate. - No anti-bot, no proxy required. Verified: plain
browse cloud fetch(no--proxies, no--verified) returns full server-rendered product HTML; the autobrowse cross-check used a bare remote session. Don't waste time/money on stealth or residential proxies here. - Correlation, not a rule: in this snapshot all 43 no-photo products were also "Нет в наличии" (out of stock), and their embedded schema.org JSON-LD showed
availability: OutOfStock. Out-of-stock is not a substitute signal for "no photo" — always test the image, not the stock state. robots.txtCrawl-delay: 1 — keep to ≤ 1 request/second to stay polite; the full catalog crawl is a few hundred fetches.
Expected Output
An .xlsx workbook (sheet "Товары без фото") with one row per no-photo product. The underlying data shape:
{
"site": "bagazhnikpro.store",
"scanned_products": 25817,
"no_photo_count": 43,
"detection_signal": "img.hotengine-shop-products-list-img-empty / src=/img/shop/no_image_ru.png",
"no_photo_products": [
{
"id": "326370",
"sku": "PSV 123323",
"title": "Набор автомобилиста \"Аварийный\" (огнетушитель, знак, аптечка )",
"category": "organayzery",
"url": "https://bagazhnikpro.store/ru/pers_shop/organayzery/326370.htm"
},
{
"id": "99322",
"sku": "м.00230",
"title": "Защита MOTODOR радиатора BMW 6 (e63) 2003-2007 куп",
"category": "zashitakarteradvigatelya",
"url": "https://bagazhnikpro.store/ru/pers_shop/zashitakarteradvigatelya/99322.htm"
},
{
"id": "327185",
"sku": "Rola 59504",
"title": "Багажник-корзина ROLA на поперечены 121,9x95,2x10,2 см",
"category": "korziniiplatformi",
"url": "https://bagazhnikpro.store/ru/pers_shop/korziniiplatformi/327185.htm"
}
]
}
Excel column layout (one row per product):
| № | ID товара | Артикул (SKU) | Название | Категория | Ссылка |
|---|---|---|---|---|---|
| 1 | 326370 | PSV 123323 | Набор автомобилиста "Аварийный" … | organayzery | https://bagazhnikpro.store/ru/pers_shop/organayzery/326370.htm |
If a run finds no missing photos in the scanned scope, return no_photo_count: 0 with an empty no_photo_products array and still emit an (empty-bodied) workbook with the header row.