Stays on your device. WebGPU (with WebAssembly fallback) runs the model in your browser. No upload, no signup, unlimited use.
Powered by SlimSAM-77 (Apache 2.0), a distilled variant of Meta's Segment Anything model. The encoder runs once per image; clicks are instant after that. The first run downloads about 22 MB of quantized weights, which are cached for next time.
Click any object. Cut it out. Download as a transparent PNG.
The PocketWebTools subject cutout uses an in-browser Segment Anything model to extract any object you click on. Unlike a background remover (which picks the whole foreground for you), this one puts the choice in your hands: tap a person to keep just that person, tap a product to lift it off a busy table, tap a pet to get the dog without the leash, the couch, or the friend next to it. Refine the mask with extra include or exclude points until it matches what you wanted, then rotate, flip, or crop it tight and download the result as a transparent PNG.
Everything runs locally on your device using WebGPU (or WebAssembly on older hardware). No upload, no signup, no per-image cost, no usage cap. Your image never touches our servers because we don't have any in the picture.
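For the curious, the WebGPU-or-WebAssembly choice can be sketched roughly as follows. This is a minimal illustration, not the tool's actual code: `pickBackend`, `NavigatorLike`, and `GpuLike` are hypothetical names, and the only real browser API assumed is `navigator.gpu.requestAdapter()`, which resolves to `null` when no suitable GPU adapter exists.

```typescript
// Hypothetical sketch: choose a backend from a navigator-like object.
type Backend = "webgpu" | "wasm";

// Minimal structural types so the sketch compiles without WebGPU typings.
interface GpuLike {
  requestAdapter(): Promise<unknown>;
}

interface NavigatorLike {
  gpu?: GpuLike;
}

// Resolve to "webgpu" only when an adapter is actually available;
// in every other case fall back to WebAssembly.
async function pickBackend(nav: NavigatorLike): Promise<Backend> {
  if (!nav.gpu) return "wasm"; // browser does not expose WebGPU at all
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "wasm"; // null adapter means no usable GPU
  } catch {
    return "wasm"; // treat any adapter error as "no WebGPU"
  }
}
```

In a page, the argument would be the browser's own `navigator` object; taking it as a parameter keeps the function easy to exercise outside a browser.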
How to use it
- Drop, paste, or click to upload an image (PNG, JPEG, WebP, or AVIF).
- Wait a few seconds while the model encodes your image (once per image; longer on devices that fall back to WebAssembly). After this, every click is instant.
- Click on the object you want to extract. The mask appears immediately. Click again on a missed area to refine, or right-click (or switch to Exclude mode) to tell SAM what NOT to include.
- Pick your output: Subject (most common, gets you a transparent PNG of just the thing you clicked), Sticker (your cutout with a thick white border and soft drop shadow, ready for iMessage/Telegram/WhatsApp), Background (everything except your selection), or Mask (a black-and-white alpha mask for use in other editors).
- Optionally rotate, flip, or crop to the subject's bounding box, then download.
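As a rough illustration of what the Subject, Background, and Mask outputs in the steps above do to the pixels, here is a minimal sketch. It assumes a hard binary mask (one byte per pixel, nonzero meaning "selected") and raw RGBA pixel data; `applyMask` and `OutputMode` are hypothetical names, the Sticker mode is left out, and the real tool may well soften edges rather than use a hard cut.

```typescript
// Hypothetical sketch of three output modes on raw RGBA pixels.
type OutputMode = "subject" | "background" | "mask";

// rgba: 4 bytes per pixel. mask: 1 byte per pixel, nonzero = selected.
function applyMask(
  rgba: Uint8ClampedArray,
  mask: Uint8Array,
  mode: OutputMode,
): Uint8ClampedArray {
  if (rgba.length !== mask.length * 4) {
    throw new Error("mask must have exactly one entry per RGBA pixel");
  }
  const out = new Uint8ClampedArray(rgba); // copy so the source stays untouched
  for (let i = 0; i < mask.length; i++) {
    const selected = mask[i] !== 0;
    const p = i * 4;
    if (mode === "mask") {
      // Black-and-white, fully opaque: white where selected, black elsewhere.
      const v = selected ? 255 : 0;
      out[p] = v;
      out[p + 1] = v;
      out[p + 2] = v;
      out[p + 3] = 255;
    } else {
      // Subject keeps selected pixels; Background keeps the rest.
      const keep = mode === "subject" ? selected : !selected;
      if (!keep) out[p + 3] = 0; // alpha 0 = fully transparent
    }
  }
  return out;
}
```

In a browser, the input would typically come from a canvas `ImageData` object's `data` array, and the result could be written back the same way before exporting a PNG.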
When this beats a background remover
- Group photos where you only want one person, not the whole row.
- Product photography on a busy surface where the background remover keeps everything in the foreground (the product AND the desk it's sitting on).
- Removing a single distracting object from a wider scene (a parked car, a stray hand, a misplaced cup).
- Pulling a graphic element out of a screenshot (a logo, a UI button, a single chart on a slide).
- Anything with multiple subjects where the cutout you want is one of several and the decision can't be made without your intent.
How clicks become a mask
Segment Anything's encoder builds a feature map of your image once. After that, every click becomes a tiny pair of tensors (positions and labels) that the decoder turns into a mask in under 50ms. That's why the first action takes a moment but every click after it feels instant. Add more positive points to expand the selection, more negative points to contract it, and SAM iteratively refines until the mask matches your intent.
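The "tiny pair of tensors" can be pictured with a short sketch. It assumes the layout used by Hugging Face's SAM implementations (coordinates shaped [1, 1, N, 2], labels shaped [1, 1, N], with 1 for include and 0 for exclude); `ClickPoint`, `PromptTensors`, and `buildPrompt` are hypothetical names, and wrapping the arrays in an actual tensor type is library-specific and omitted here.

```typescript
// Hypothetical sketch: turn a list of clicks into decoder prompt data.
interface ClickPoint {
  x: number; // pixel position in the displayed image
  y: number;
  include: boolean; // true = include point, false = exclude point
}

interface PromptTensors {
  coords: Float32Array; // flattened [x0, y0, x1, y1, ...]
  coordsShape: [number, number, number, number]; // [batch, prompt, points, 2]
  labels: number[]; // assumed convention: 1 = include, 0 = exclude
  labelsShape: [number, number, number]; // [batch, prompt, points]
}

// scale maps displayed-image pixels to the resized image the encoder saw.
function buildPrompt(points: ClickPoint[], scale: number): PromptTensors {
  const coords = new Float32Array(points.length * 2);
  const labels: number[] = [];
  points.forEach((pt, i) => {
    coords[i * 2] = pt.x * scale;
    coords[i * 2 + 1] = pt.y * scale;
    labels.push(pt.include ? 1 : 0);
  });
  return {
    coords,
    coordsShape: [1, 1, points.length, 2],
    labels,
    labelsShape: [1, 1, points.length],
  };
}
```

Because every click resends the full list of points, the decoder always sees the whole prompt; the expensive image embedding is reused unchanged.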
The model picks the best of three candidate masks per click based on its own confidence (the IoU score, shown next to the point count). When that score is low, the result is usually ambiguous; adding one more point almost always fixes it.
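Picking the best of the three candidates amounts to taking the highest predicted IoU score. A minimal sketch, with hypothetical function names and an illustrative 0.8 threshold for "low confidence" (the tool's real cutoff, if it has one, is not stated here):

```typescript
// Hypothetical sketch: choose the candidate mask with the highest IoU score.
function pickBestMask(iouScores: readonly number[]): number {
  if (iouScores.length === 0) {
    throw new Error("expected at least one candidate score");
  }
  let bestIndex = 0;
  let bestScore = -Infinity;
  iouScores.forEach((score, i) => {
    if (score > bestScore) {
      bestScore = score;
      bestIndex = i;
    }
  });
  return bestIndex; // index of the candidate to show by default
}

// Flags a result worth refining with one more point.
// The default threshold is an assumption made for illustration only.
function isLowConfidence(iouScores: readonly number[], threshold = 0.8): boolean {
  return Math.max(...iouScores) < threshold;
}
```

For example, scores of 0.62, 0.91, and 0.88 would select the second candidate (index 1) and would not be flagged as low confidence at the illustrative threshold.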
Private by design, free forever
Cloud cutout tools require you to upload your image to a server you don't control, and many of them charge per cutout or gate the feature behind a paid plan. We take the opposite path: the model runs in your tab, your image never touches our servers, and we don't have an inference bill to pay, so the tool stays free and unlimited.
Frequently asked questions
- How is this different from the background remover?
- The background remover picks the whole foreground for you (one model, binary output). The subject cutout lets YOU choose what to extract by clicking on it. If your image has three people and you only want one, the background remover keeps all three; the cutout tool keeps just the person you click. Different tools for different jobs.
- Does my image get uploaded anywhere?
- No. The image stays in your browser the entire time. The Segment Anything model runs locally on your device using WebGPU (or WebAssembly on devices without WebGPU). There's no server in the loop, no upload, no logs.
- How do I tell it what I want?
- Click anywhere on the object you want to extract. SAM produces a mask in milliseconds. If the mask covers too much (it grabs a person AND the chair they're sitting on), right-click on the part you don't want (or switch to Exclude mode and tap) to refine. Add as many positive and negative points as you need until the mask matches your intent. Each click also produces three candidate masks (e.g. 'just the person', 'person and chair', 'whole foreground'), shown as numbered chips with a confidence score. Switch to a different candidate if the auto-picked one isn't what you meant.
- What's the small lag the first time I drop an image?
- The model has to encode your image once before clicks become instant. The encoder is the expensive part of Segment Anything; it produces an embedding the decoder then uses for every click. On WebGPU this takes about 1 to 3 seconds; on WebAssembly (mostly iOS) it can take 5 to 15 seconds. After that, every click is sub-50ms because the decoder is tiny.
- Why is there a 1024-pixel cap on input size?
- SAM's image encoder resizes every input to fit a 1024×1024 frame internally, regardless of source size, so anything larger is wasted work plus extra memory. Capping the long edge at 1024 pixels before encoding keeps memory low (important on iPhones) without losing any mask quality. Output mask resolution matches the source you provide, up to that cap.
- What do the four output modes do?
- Subject keeps only what's inside the mask, with everything else transparent (the most common case). Sticker wraps your cutout in a thick white border with a soft drop shadow, like an iMessage or Telegram sticker, and downloads as a transparent PNG you can drop into chat or layer in any editor. Background keeps everything outside the mask and makes the masked area transparent (useful when you want to delete one object from a photo). Mask exports the binary alpha mask as a black-and-white PNG (useful for image editors like Photoshop or Affinity that want a mask layer).
- Can I make stickers from my cutouts?
- Yes. Switch the output to Sticker after you've selected your subject. A thick white border traces the cutout and a soft drop shadow sits behind it, the same look used by iMessage, Telegram, and WhatsApp stickers. The thickness slider runs from 1 to 20 pixels, so you can go from a thin outline to a chunky sticker frame. The preview updates as you drag, and the result is a transparent PNG you can drop into a chat, paste onto a poster, or layer over any background.
- Can I rotate or flip the cutout before downloading?
- Yes. Use the rotate-left, rotate-right, flip-horizontal, and flip-vertical buttons under 'Rotate and flip' on the controls panel. Transforms apply to the final PNG, not to your source image. You can stack them.
- What does 'Crop to subject' do?
- Trims transparent margins to the bounding box of the mask. Only meaningful for the Subject output mode. If you click a small object in a large image, this option gives you a tight crop instead of a big mostly-transparent PNG.
- Which model is this using?
- SlimSAM-77, a distilled variant of Meta's Segment Anything Model. Apache 2.0 licensed, about 22 MB once downloaded, and engineered to run in browsers via WebGPU and WebAssembly. We picked it for the size/quality balance: it's small enough to ship to phones, and the canonical Hugging Face transformers.js example uses it for the same workflow.
- Does this work on iPhone?
- Yes. SlimSAM is small enough to run cleanly on iOS Safari via WebAssembly. The encoder takes longer than on desktop with WebGPU, but the model fits well within iOS's per-tab memory budget, so the tab won't be killed mid-encode, as it can be with heavier vision models.
- Can I use the result commercially?
- Yes. The output is yours, just like with any image editor. SlimSAM is Apache 2.0 (permissive open source); SAM-derived weights inherit that license. We claim no rights over images you process here.
- What's the right workflow for compositing the cutout into a new background?
- Download the Subject PNG, then drop it into Figma, Canva, Photoshop, or any layer-based editor on top of your new background. The transparent edges blend cleanly with whatever's behind. For e-commerce product shots, downloading with 'Crop to subject' on gives you a tight crop ready to drop into a product grid.
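Two of the answers above come down to simple geometry: the 1024-pixel cap on the long edge and 'Crop to subject'. The sketch below shows one plausible way to compute both. `capLongEdge`, `maskBoundingBox`, `Size`, and `Box` are hypothetical names, and the mask is assumed to be one byte per pixel in row-major order with nonzero meaning "selected".

```typescript
// Hypothetical sketch: size cap and tight crop box.
interface Size {
  width: number;
  height: number;
}

// Scale a size down so its long edge is at most `cap`, keeping aspect ratio.
function capLongEdge(size: Size, cap = 1024): Size {
  const longEdge = Math.max(size.width, size.height);
  if (longEdge <= cap) return { ...size }; // already small enough
  const scale = cap / longEdge;
  return {
    width: Math.max(1, Math.round(size.width * scale)),
    height: Math.max(1, Math.round(size.height * scale)),
  };
}

interface Box {
  x: number;
  y: number;
  width: number;
  height: number;
}

// Smallest rectangle containing every selected pixel, or null if none are.
function maskBoundingBox(mask: Uint8Array, width: number, height: number): Box | null {
  if (mask.length !== width * height) {
    throw new Error("mask length must equal width * height");
  }
  let minX = width;
  let minY = height;
  let maxX = -1;
  let maxY = -1;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      if (mask[y * width + x] !== 0) {
        if (x < minX) minX = x;
        if (x > maxX) maxX = x;
        if (y < minY) minY = y;
        if (y > maxY) maxY = y;
      }
    }
  }
  if (maxX < 0) return null; // empty mask: nothing to crop to
  return { x: minX, y: minY, width: maxX - minX + 1, height: maxY - minY + 1 };
}
```

Under these assumptions, a 4000×3000 photo would be encoded at 1024×768, and clicking a small object would yield a crop box only as large as the object's mask.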
Related tools
- Background remover: pull the whole foreground out automatically when there's just one subject.
- Image upscaler: sharpen and enlarge small images up to 4x with AI.
- Image to text (OCR): extract text from screenshots, photos, and scans in 100+ languages.