new readme

new backend
new server frame
2023-10-29 13:57:27 +08:00 · 2023-10-29 13:54:25 +08:00 · 2023-10-22 10:59:55 +08:00 · 2023-10-22 09:33:09 +08:00 · 2023-10-22 09:23:06 +08:00 · 2023-10-22 09:15:54 +08:00
32 changed files with 143 additions and 1080 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -18,7 +18,7 @@ wheels/
 *.npy
 *.ts
 model_ts*.txt
-
+*.pth
 # onnx models
 *.onnx

@@ -35,8 +35,9 @@ model_ts*.txt
 .idea
 .vscode
 _darcs
-
+onepage.py
 # demo 
 **/node_modules
 yarn.lock
-package-lock.json
+package-lock.json
+/notebooks
--- a/README.md
+++ b/README.md
@@ -1,171 +1,4 @@
-# Segment Anything
-
-**[Meta AI Research, FAIR](https://ai.facebook.com/research/)**
-
-[Alexander Kirillov](https://alexander-kirillov.github.io/), [Eric Mintun](https://ericmintun.github.io/), [Nikhila Ravi](https://nikhilaravi.com/), [Hanzi Mao](https://hanzimao.me/), Chloe Rolland, Laura Gustafson, [Tete Xiao](https://tetexiao.com), [Spencer Whitehead](https://www.spencerwhitehead.com/), Alex Berg, Wan-Yen Lo, [Piotr Dollar](https://pdollar.github.io/), [Ross Girshick](https://www.rossgirshick.info/)
-
-[[`Paper`](https://ai.facebook.com/research/publications/segment-anything/)] [[`Project`](https://segment-anything.com/)] [[`Demo`](https://segment-anything.com/demo)] [[`Dataset`](https://segment-anything.com/dataset/index.html)] [[`Blog`](https://ai.facebook.com/blog/segment-anything-foundation-model-image-segmentation/)] [[`BibTeX`](#citing-segment-anything)]
-
-![SAM design](assets/model_diagram.png?raw=true)
-
-The **Segment Anything Model (SAM)** produces high quality object masks from input prompts such as points or boxes, and it can be used to generate masks for all objects in an image. It has been trained on a [dataset](https://segment-anything.com/dataset/index.html) of 11 million images and 1.1 billion masks, and has strong zero-shot performance on a variety of segmentation tasks.
-
-<p float="left">
-  <img src="assets/masks1.png?raw=true" width="37.25%" />
-  <img src="assets/masks2.jpg?raw=true" width="61.5%" /> 
-</p>
-
-## Installation
-
-The code requires `python>=3.8`, as well as `pytorch>=1.7` and `torchvision>=0.8`. Please follow the instructions [here](https://pytorch.org/get-started/locally/) to install both PyTorch and TorchVision dependencies. Installing both PyTorch and TorchVision with CUDA support is strongly recommended.
-
-Install Segment Anything:
-
-```
-pip install git+https://github.com/facebookresearch/segment-anything.git
-```
-
-or clone the repository locally and install with
-
-```
-git clone git@github.com:facebookresearch/segment-anything.git
-cd segment-anything; pip install -e .
-```
-
-The following optional dependencies are necessary for mask post-processing, saving masks in COCO format, the example notebooks, and exporting the model in ONNX format. `jupyter` is also required to run the example notebooks.
-
-```
-pip install opencv-python pycocotools matplotlib onnxruntime onnx
-```
-
-## <a name="GettingStarted"></a>Getting Started
-
-First download a [model checkpoint](#model-checkpoints). Then the model can be used in just a few lines to get masks from a given prompt:
-
-```
-from segment_anything import SamPredictor, sam_model_registry
-sam = sam_model_registry["<model_type>"](checkpoint="<path/to/checkpoint>")
-predictor = SamPredictor(sam)
-predictor.set_image(<your_image>)
-masks, _, _ = predictor.predict(<input_prompts>)
-```
-
-or generate masks for an entire image:
-
-```
-from segment_anything import SamAutomaticMaskGenerator, sam_model_registry
-sam = sam_model_registry["<model_type>"](checkpoint="<path/to/checkpoint>")
-mask_generator = SamAutomaticMaskGenerator(sam)
-masks = mask_generator.generate(<your_image>)
-```
-
-Additionally, masks can be generated for images from the command line:
-
-```
-python scripts/amg.py --checkpoint <path/to/checkpoint> --model-type <model_type> --input <image_or_folder> --output <path/to/output>
-```
-
-See the example notebooks on [using SAM with prompts](/notebooks/predictor_example.ipynb) and [automatically generating masks](/notebooks/automatic_mask_generator_example.ipynb) for more details.
-
-<p float="left">
-  <img src="assets/notebook1.png?raw=true" width="49.1%" />
-  <img src="assets/notebook2.png?raw=true" width="48.9%" />
-</p>
-
-## ONNX Export
-
-SAM's lightweight mask decoder can be exported to ONNX format so that it can be run in any environment that supports ONNX runtime, such as in-browser as showcased in the [demo](https://segment-anything.com/demo). Export the model with
-
-```
-python scripts/export_onnx_model.py --checkpoint <path/to/checkpoint> --model-type <model_type> --output <path/to/output>
-```
-
-See the [example notebook](https://github.com/facebookresearch/segment-anything/blob/main/notebooks/onnx_model_example.ipynb) for details on how to combine image preprocessing via SAM's backbone with mask prediction using the ONNX model. It is recommended to use the latest stable version of PyTorch for ONNX export.
-
-### Web demo
-
-The `demo/` folder has a simple one page React app which shows how to run mask prediction with the exported ONNX model in a web browser with multithreading. Please see [`demo/README.md`](https://github.com/facebookresearch/segment-anything/blob/main/demo/README.md) for more details.
-
-## <a name="Models"></a>Model Checkpoints
-
-Three versions of the model are available with different backbone sizes. These models can be instantiated by running
-
-```
-from segment_anything import sam_model_registry
-sam = sam_model_registry["<model_type>"](checkpoint="<path/to/checkpoint>")
-```
-
-Click the links below to download the checkpoint for the corresponding model type.
-
- **`default` or `vit_h`: [ViT-H SAM model.](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_h_4b8939.pth)**
- `vit_l`: [ViT-L SAM model.](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_l_0b3195.pth)
- `vit_b`: [ViT-B SAM model.](https://dl.fbaipublicfiles.com/segment_anything/sam_vit_b_01ec64.pth)
-
-## Dataset
-
-See [here](https://ai.facebook.com/datasets/segment-anything/) for an overview of the dataset. The dataset can be downloaded [here](https://ai.facebook.com/datasets/segment-anything-downloads/). By downloading the datasets you agree that you have read and accepted the terms of the SA-1B Dataset Research License.
-
-We save masks per image as a json file. It can be loaded as a dictionary in python in the below format.
-
-```python
-{
-    "image"                 : image_info,
-    "annotations"           : [annotation],
-}
-
-image_info {
-    "image_id"              : int,              # Image id
-    "width"                 : int,              # Image width
-    "height"                : int,              # Image height
-    "file_name"             : str,              # Image filename
-}
-
-annotation {
-    "id"                    : int,              # Annotation id
-    "segmentation"          : dict,             # Mask saved in COCO RLE format.
-    "bbox"                  : [x, y, w, h],     # The box around the mask, in XYWH format
-    "area"                  : int,              # The area in pixels of the mask
-    "predicted_iou"         : float,            # The model's own prediction of the mask's quality
-    "stability_score"       : float,            # A measure of the mask's quality
-    "crop_box"              : [x, y, w, h],     # The crop of the image used to generate the mask, in XYWH format
-    "point_coords"          : [[x, y]],         # The point coordinates input to the model to generate the mask
-}
-```
-
-Image ids can be found in sa_images_ids.txt which can be downloaded using the above [link](https://ai.facebook.com/datasets/segment-anything-downloads/) as well.
-
-To decode a mask in COCO RLE format into binary:
-
-```
-from pycocotools import mask as mask_utils
-mask = mask_utils.decode(annotation["segmentation"])
-```
-
-See [here](https://github.com/cocodataset/cocoapi/blob/master/PythonAPI/pycocotools/mask.py) for more instructions to manipulate masks stored in RLE format.
-
-## License
-
-The model is licensed under the [Apache 2.0 license](LICENSE).
-
-## Contributing
-
-See [contributing](CONTRIBUTING.md) and the [code of conduct](CODE_OF_CONDUCT.md).
-
-## Contributors
-
-The Segment Anything project was made possible with the help of many contributors (alphabetical):
-
-Aaron Adcock, Vaibhav Aggarwal, Morteza Behrooz, Cheng-Yang Fu, Ashley Gabriel, Ahuva Goldstand, Allen Goodman, Sumanth Gurram, Jiabo Hu, Somya Jain, Devansh Kukreja, Robert Kuo, Joshua Lane, Yanghao Li, Lilian Luong, Jitendra Malik, Mallika Malhotra, William Ngan, Omkar Parkhi, Nikhil Raina, Dirk Rowe, Neil Sejoor, Vanessa Stark, Bala Varadarajan, Bram Wasti, Zachary Winstrom
-
-## Citing Segment Anything
-
-If you use SAM or SA-1B in your research, please use the following BibTeX entry.
-
-```
-@article{kirillov2023segany,
-  title={Segment Anything},
-  author={Kirillov, Alexander and Mintun, Eric and Ravi, Nikhila and Mao, Hanzi and Rolland, Chloe and Gustafson, Laura and Xiao, Tete and Whitehead, Spencer and Berg, Alexander C. and Lo, Wan-Yen and Doll{\'a}r, Piotr and Girshick, Ross},
-  journal={arXiv:2304.02643},
-  year={2023}
-}
-```
+## To start
+```cmd
+uwsgi --http :2020 --wsgi-file backend.py --callable app 
+```
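Once the server is listening on port 2020, a client could request an embedding roughly as follows. This is a minimal sketch using only the standard library. It assumes the `/embedding` route and the `imgurl` JSON key from `backend.py`, and that `imgurl` carries raw base64 (no `data:` URL prefix), since the backend passes it straight to `base64.b64decode`. The function names and the `localhost` URL are hypothetical.

```python
import base64
import json
import urllib.request

# Assumed default: the port comes from the uwsgi command, the host is a guess.
DEFAULT_URL = "http://localhost:2020/embedding"


def build_embedding_request(image_bytes, url=DEFAULT_URL):
    """Build a POST request carrying the image as base64 under the 'imgurl' key."""
    body = json.dumps({"imgurl": base64.b64encode(image_bytes).decode("ascii")})
    return urllib.request.Request(
        url,
        data=body.encode("utf-8"),
        # backend.py compares the header against exactly 'application/json'
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_embedding_bytes(image_path, url=DEFAULT_URL):
    """Send an image file to the server and return the raw .npy response bytes."""
    with open(image_path, "rb") as f:
        request = build_embedding_request(f.read(), url)
    with urllib.request.urlopen(request) as response:
        return response.read()
```

The returned bytes are a serialized `.npy` file, so with NumPy installed they can be turned back into an array with `np.load(BytesIO(data))`.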
--- a/assets/masks1.png
+++ b/assets/masks1.png
--- a/assets/masks2.jpg
+++ b/assets/masks2.jpg
--- a/assets/minidemo.gif
+++ b/assets/minidemo.gif
--- a/assets/model_diagram.png
+++ b/assets/model_diagram.png
--- a/assets/notebook1.png
+++ b/assets/notebook1.png
--- a/assets/notebook2.png
+++ b/assets/notebook2.png
--- a/backend.py
+++ b/backend.py
@@ -0,0 +1,59 @@
+import base64
+from io import BytesIO
+import numpy as np
+import cv2
+from segment_anything import sam_model_registry, SamPredictor
+import torch
+import time
+from flask import Flask, Response, request
+
+# check torch version and if cuda is available
+print(torch.__version__)
+print(torch.cuda.is_available())
+# checkpoint = "sam_vit_b_01ec64.pth"
+print(time.time())
+# set large model related configs
+checkpoint = "sam_vit_h_4b8939.pth"
+model_type = "vit_h"
+sam = sam_model_registry[model_type](checkpoint=checkpoint)
+sam.to(device='cuda')
+predictor = SamPredictor(sam)
+
+app = Flask(__name__)
+
+
+def base64_to_image(base64_code):
+    img_data = base64.b64decode(base64_code)
+    img_array = np.frombuffer(img_data, np.uint8)
+    print(img_array)
+    img = cv2.imdecode(img_array, -1)
+    print(img)
+    return img
+
+
+@app.route('/embedding', methods=["GET", "POST"])
+def index():
+    print('request received')
+    if (request.method != 'POST'):
+        return 'Only support POST method'
+
+    content_type = request.headers.get('Content-Type')
+    if (content_type == 'application/json'):
+        print('start calculate embedding')
+        base64data = request.get_json()['imgurl']
+        image = base64_to_image(base64data)
+        image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
+        predictor.set_image(image)
+        image_embedding = predictor.get_image_embedding().cpu().numpy()
+        byte_io = BytesIO()
+        np.save(byte_io, image_embedding)
+        byte_io.seek(0)
+        np.save("array.npy", image_embedding)
+        response = Response(byte_io, mimetype="application/octet-stream")
+        response.headers["Content-Length"] = 4194432
+        return response
+    else:
+        return 'Content-Type not supported!'
+    
+    
+print('server starts working')
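The hardcoded `Content-Length` of 4194432 in `backend.py` appears to correspond to a float32 array of shape `(1, 256, 64, 64)` (4,194,304 bytes of data) plus a 128-byte `.npy` header. The sketch below shows one way the length could be derived from the serialized buffer instead of being fixed in code. The helper name `serialize_embedding` is hypothetical, and the zero array is only a stand-in for a real SAM embedding with that assumed shape.

```python
from io import BytesIO

import numpy as np


def serialize_embedding(embedding):
    """Serialize an array to .npy bytes and return (payload, byte_count)."""
    buffer = BytesIO()
    np.save(buffer, embedding)
    payload = buffer.getvalue()
    return payload, len(payload)


# Stand-in for a SAM image embedding; the shape and dtype are assumptions.
embedding = np.zeros((1, 256, 64, 64), dtype=np.float32)
payload, content_length = serialize_embedding(embedding)

# The client side can recover the array from the same bytes.
restored = np.load(BytesIO(payload))
```

Computing the length this way keeps the header correct if the model variant, dtype, or NumPy header layout ever changes. Passing `payload` (bytes) to `Response` would also let Flask fill in `Content-Length` on its own.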
--- a/demo/README.md
+++ b/demo/README.md
@@ -1,126 +0,0 @@
-## Segment Anything Simple Web demo
-
-This **front-end only** React based web demo shows how to load a fixed image and corresponding `.npy` file of the SAM image embedding, and run the SAM ONNX model in the browser using Web Assembly with multithreading enabled by `SharedArrayBuffer`, Web Worker, and SIMD128.
-
-<img src="https://github.com/facebookresearch/segment-anything/raw/main/assets/minidemo.gif" width="500"/>
-
-## Run the app
-
-Install Yarn
-
-```
-npm install --g yarn
-```
-
-Build and run:
-
-```
-yarn && yarn start
-```
-
-Navigate to [`http://localhost:8081/`](http://localhost:8081/)
-
-Move your cursor around to see the mask prediction update in real time.
-
-## Export the image embedding
-
-In the [ONNX Model Example notebook](https://github.com/facebookresearch/segment-anything/blob/main/notebooks/onnx_model_example.ipynb) upload the image of your choice and generate and save corresponding embedding.
-
-Initialize the predictor:
-
-```python
-checkpoint = "sam_vit_h_4b8939.pth"
-model_type = "vit_h"
-sam = sam_model_registry[model_type](checkpoint=checkpoint)
-sam.to(device='cuda')
-predictor = SamPredictor(sam)
-```
-
-Set the new image and export the embedding:
-
-```
-image = cv2.imread('src/assets/dogs.jpg')
-predictor.set_image(image)
-image_embedding = predictor.get_image_embedding().cpu().numpy()
-np.save("dogs_embedding.npy", image_embedding)
-```
-
-Save the new image and embedding in `src/assets/data`.
-
-## Export the ONNX model
-
-You also need to export the quantized ONNX model from the [ONNX Model Example notebook](https://github.com/facebookresearch/segment-anything/blob/main/notebooks/onnx_model_example.ipynb).
-
-Run the cell in the notebook which saves the `sam_onnx_quantized_example.onnx` file, download it and copy it to the path `/model/sam_onnx_quantized_example.onnx`.
-
-Here is a snippet of the export/quantization code:
-
-```
-onnx_model_path = "sam_onnx_example.onnx"
-onnx_model_quantized_path = "sam_onnx_quantized_example.onnx"
-quantize_dynamic(
-    model_input=onnx_model_path,
-    model_output=onnx_model_quantized_path,
-    optimize_model=True,
-    per_channel=False,
-    reduce_range=False,
-    weight_type=QuantType.QUInt8,
-)
-```
-
-**NOTE: if you change the ONNX model by using a new checkpoint you need to also re-export the embedding.**
-
-## Update the image, embedding, model in the app
-
-Update the following file paths at the top of `App.tsx`:
-
-```py
-const IMAGE_PATH = "/assets/data/dogs.jpg";
-const IMAGE_EMBEDDING = "/assets/data/dogs_embedding.npy";
-const MODEL_DIR = "/model/sam_onnx_quantized_example.onnx";
-```
-
-## ONNX multithreading with SharedArrayBuffer
-
-To use multithreading, the appropriate headers need to be set to create a cross origin isolation state which will enable use of `SharedArrayBuffer` (see this [blog post](https://cloudblogs.microsoft.com/opensource/2021/09/02/onnx-runtime-web-running-your-machine-learning-model-in-browser/) for more details)
-
-The headers below are set in `configs/webpack/dev.js`:
-
-```js
-headers: {
-    "Cross-Origin-Opener-Policy": "same-origin",
-    "Cross-Origin-Embedder-Policy": "credentialless",
-}
-```
-
-## Structure of the app
-
-**`App.tsx`**
-
- Initializes ONNX model
- Loads image embedding and image
- Runs the ONNX model based on input prompts
-
-**`Stage.tsx`**
-
- Handles mouse move interaction to update the ONNX model prompt
-
-**`Tool.tsx`**
-
- Renders the image and the mask prediction
-
-**`helpers/maskUtils.tsx`**
-
- Conversion of ONNX model output from array to an HTMLImageElement
-
-**`helpers/onnxModelAPI.tsx`**
-
- Formats the inputs for the ONNX model
-
-**`helpers/scaleHelper.tsx`**
-
- Handles image scaling logic for SAM (longest size 1024)
-
-**`hooks/`**
-
- Handle shared state for the app
--- a/demo/configs/webpack/common.js
+++ b/demo/configs/webpack/common.js
@@ -1,84 +0,0 @@
-// Copyright (c) Meta Platforms, Inc. and affiliates.
-// All rights reserved.
-
-// This source code is licensed under the license found in the
-// LICENSE file in the root directory of this source tree.
-
-const { resolve } = require("path");
-const HtmlWebpackPlugin = require("html-webpack-plugin");
-const FriendlyErrorsWebpackPlugin = require("friendly-errors-webpack-plugin");
-const CopyPlugin = require("copy-webpack-plugin");
-const webpack = require("webpack");
-
-module.exports = {
-  entry: "./src/index.tsx",
-  resolve: {
-    extensions: [".js", ".jsx", ".ts", ".tsx"],
-  },
-  output: {
-    path: resolve(__dirname, "dist"),
-  },
-  module: {
-    rules: [
-      {
-        test: /\.mjs$/,
-        include: /node_modules/,
-        type: "javascript/auto",
-        resolve: {
-          fullySpecified: false,
-        },
-      },
-      {
-        test: [/\.jsx?$/, /\.tsx?$/],
-        use: ["ts-loader"],
-        exclude: /node_modules/,
-      },
-      {
-        test: /\.css$/,
-        use: ["style-loader", "css-loader"],
-      },
-      {
-        test: /\.(scss|sass)$/,
-        use: ["style-loader", "css-loader", "postcss-loader"],
-      },
-      {
-        test: /\.(jpe?g|png|gif|svg)$/i,
-        use: [
-          "file-loader?hash=sha512&digest=hex&name=img/[contenthash].[ext]",
-          "image-webpack-loader?bypassOnDebug&optipng.optimizationLevel=7&gifsicle.interlaced=false",
-        ],
-      },
-      {
-        test: /\.(woff|woff2|ttf)$/,
-        use: {
-          loader: "url-loader",
-        },
-      },
-    ],
-  },
-  plugins: [
-    new CopyPlugin({
-      patterns: [
-        {
-          from: "node_modules/onnxruntime-web/dist/*.wasm",
-          to: "[name][ext]",
-        },
-        {
-          from: "model",
-          to: "model",
-        },
-        {
-          from: "src/assets",
-          to: "assets",
-        },
-      ],
-    }),
-    new HtmlWebpackPlugin({
-      template: "./src/assets/index.html",
-    }),
-    new FriendlyErrorsWebpackPlugin(),
-    new webpack.ProvidePlugin({
-      process: "process/browser",
-    }),
-  ],
-};
--- a/demo/configs/webpack/dev.js
+++ b/demo/configs/webpack/dev.js
@@ -1,25 +0,0 @@
-// Copyright (c) Meta Platforms, Inc. and affiliates.
-// All rights reserved.
-
-// This source code is licensed under the license found in the
-// LICENSE file in the root directory of this source tree.
-
-// development config
-const { merge } = require("webpack-merge");
-const commonConfig = require("./common");
-
-module.exports = merge(commonConfig, {
-  mode: "development",
-  devServer: {
-    hot: true, // enable HMR on the server
-    open: true,
-    // These headers enable the cross origin isolation state
-    // needed to enable use of SharedArrayBuffer for ONNX 
-    // multithreading. 
-    headers: {
-      "Cross-Origin-Opener-Policy": "same-origin",
-      "Cross-Origin-Embedder-Policy": "credentialless",
-    },
-  },
-  devtool: "cheap-module-source-map",
-});
--- a/demo/configs/webpack/prod.js
+++ b/demo/configs/webpack/prod.js
@@ -1,22 +0,0 @@
-// Copyright (c) Meta Platforms, Inc. and affiliates.
-// All rights reserved.
-
-// This source code is licensed under the license found in the
-// LICENSE file in the root directory of this source tree.
-
-// production config
-const { merge } = require("webpack-merge");
-const { resolve } = require("path");
-const Dotenv = require("dotenv-webpack");
-const commonConfig = require("./common");
-
-module.exports = merge(commonConfig, {
-  mode: "production",
-  output: {
-    filename: "js/bundle.[contenthash].min.js",
-    path: resolve(__dirname, "../../dist"),
-    publicPath: "/",
-  },
-  devtool: "source-map",
-  plugins: [new Dotenv()],
-});
--- a/demo/package.json
+++ b/demo/package.json
@@ -1,62 +0,0 @@
-{
-  "name": "segment-anything-mini-demo",
-  "version": "0.1.0",
-  "license": "MIT",
-  "scripts": {
-    "build": "yarn run clean-dist && webpack --config=configs/webpack/prod.js && mv dist/*.wasm dist/js",
-    "clean-dist": "rimraf dist/*",
-    "lint": "eslint './src/**/*.{js,ts,tsx}' --quiet",
-    "start": "yarn run start-dev",
-    "test": "yarn run start-model-test",
-    "start-dev": "webpack serve --config=configs/webpack/dev.js"
-  },
-  "devDependencies": {
-    "@babel/core": "^7.18.13",
-    "@babel/preset-env": "^7.18.10",
-    "@babel/preset-react": "^7.18.6",
-    "@babel/preset-typescript": "^7.18.6",
-    "@pmmmwh/react-refresh-webpack-plugin": "^0.5.7",
-    "@testing-library/react": "^13.3.0",
-    "@types/node": "^18.7.13",
-    "@types/react": "^18.0.17",
-    "@types/react-dom": "^18.0.6",
-    "@types/underscore": "^1.11.4",
-    "@typescript-eslint/eslint-plugin": "^5.35.1",
-    "@typescript-eslint/parser": "^5.35.1",
-    "babel-loader": "^8.2.5",
-    "copy-webpack-plugin": "^11.0.0",
-    "css-loader": "^6.7.1",
-    "dotenv": "^16.0.2",
-    "dotenv-webpack": "^8.0.1",
-    "eslint": "^8.22.0",
-    "eslint-plugin-react": "^7.31.0",
-    "file-loader": "^6.2.0",
-    "fork-ts-checker-webpack-plugin": "^7.2.13",
-    "friendly-errors-webpack-plugin": "^1.7.0",
-    "html-webpack-plugin": "^5.5.0",
-    "image-webpack-loader": "^8.1.0",
-    "postcss-loader": "^7.0.1",
-    "postcss-preset-env": "^7.8.0",
-    "process": "^0.11.10",
-    "rimraf": "^3.0.2",
-    "sass": "^1.54.5",
-    "sass-loader": "^13.0.2",
-    "style-loader": "^3.3.1",
-    "tailwindcss": "^3.1.8",
-    "ts-loader": "^9.3.1",
-    "typescript": "^4.8.2",
-    "webpack": "^5.74.0",
-    "webpack-cli": "^4.10.0",
-    "webpack-dev-server": "^4.10.0",
-    "webpack-dotenv-plugin": "^2.1.0",
-    "webpack-merge": "^5.8.0"
-  },
-  "dependencies": {
-    "npyjs": "^0.4.0",
-    "onnxruntime-web": "^1.14.0",
-    "react": "^18.2.0",
-    "react-dom": "^18.2.0",
-    "underscore": "^1.13.6",
-    "react-refresh": "^0.14.0"
-  }
-}
--- a/demo/postcss.config.js
+++ b/demo/postcss.config.js
@@ -1,10 +0,0 @@
-// Copyright (c) Meta Platforms, Inc. and affiliates.
-// All rights reserved.
-
-// This source code is licensed under the license found in the
-// LICENSE file in the root directory of this source tree.
-
-const tailwindcss = require("tailwindcss");
-module.exports = {
-  plugins: ["postcss-preset-env", 'tailwindcss/nesting', tailwindcss],
-};
--- a/demo/src/App.tsx
+++ b/demo/src/App.tsx
@@ -1,130 +0,0 @@
-// Copyright (c) Meta Platforms, Inc. and affiliates.
-// All rights reserved.
-
-// This source code is licensed under the license found in the
-// LICENSE file in the root directory of this source tree.
-
-import { InferenceSession, Tensor } from "onnxruntime-web";
-import React, { useContext, useEffect, useState } from "react";
-import "./assets/scss/App.scss";
-import { handleImageScale } from "./components/helpers/scaleHelper";
-import { modelScaleProps } from "./components/helpers/Interfaces";
-import { onnxMaskToImage } from "./components/helpers/maskUtils";
-import { modelData } from "./components/helpers/onnxModelAPI";
-import Stage from "./components/Stage";
-import AppContext from "./components/hooks/createContext";
-const ort = require("onnxruntime-web");
-/* @ts-ignore */
-import npyjs from "npyjs";
-
-// Define image, embedding and model paths
-const IMAGE_PATH = "/assets/data/dogs.jpg";
-const IMAGE_EMBEDDING = "/assets/data/dogs_embedding.npy";
-const MODEL_DIR = "/model/sam_onnx_quantized_example.onnx";
-
-const App = () => {
-  const {
-    clicks: [clicks],
-    image: [, setImage],
-    maskImg: [, setMaskImg],
-  } = useContext(AppContext)!;
-  const [model, setModel] = useState<InferenceSession | null>(null); // ONNX model
-  const [tensor, setTensor] = useState<Tensor | null>(null); // Image embedding tensor
-
-  // The ONNX model expects the input to be rescaled to 1024. 
-  // The modelScale state variable keeps track of the scale values.
-  const [modelScale, setModelScale] = useState<modelScaleProps | null>(null);
-
-  // Initialize the ONNX model. load the image, and load the SAM
-  // pre-computed image embedding
-  useEffect(() => {
-    // Initialize the ONNX model
-    const initModel = async () => {
-      try {
-        if (MODEL_DIR === undefined) return;
-        const URL: string = MODEL_DIR;
-        const model = await InferenceSession.create(URL);
-        setModel(model);
-      } catch (e) {
-        console.log(e);
-      }
-    };
-    initModel();
-
-    // Load the image
-    const url = new URL(IMAGE_PATH, location.origin);
-    loadImage(url);
-
-    // Load the Segment Anything pre-computed embedding
-    Promise.resolve(loadNpyTensor(IMAGE_EMBEDDING, "float32")).then(
-      (embedding) => setTensor(embedding)
-    );
-  }, []);
-
-  const loadImage = async (url: URL) => {
-    try {
-      const img = new Image();
-      img.src = url.href;
-      img.onload = () => {
-        const { height, width, samScale } = handleImageScale(img);
-        setModelScale({
-          height: height,  // original image height
-          width: width,  // original image width
-          samScale: samScale, // scaling factor for image which has been resized to longest side 1024
-        });
-        img.width = width; 
-        img.height = height; 
-        setImage(img);
-      };
-    } catch (error) {
-      console.log(error);
-    }
-  };
-
-  // Decode a Numpy file into a tensor. 
-  const loadNpyTensor = async (tensorFile: string, dType: string) => {
-    let npLoader = new npyjs();
-    const npArray = await npLoader.load(tensorFile);
-    const tensor = new ort.Tensor(dType, npArray.data, npArray.shape);
-    return tensor;
-  };
-
-  // Run the ONNX model every time clicks has changed
-  useEffect(() => {
-    runONNX();
-  }, [clicks]);
-
-  const runONNX = async () => {
-    try {
-      if (
-        model === null ||
-        clicks === null ||
-        tensor === null ||
-        modelScale === null
-      )
-        return;
-      else {
-        // Prepare the model input in the correct format for SAM.
-        // The modelData function is from onnxModelAPI.tsx.
-        const feeds = modelData({
-          clicks,
-          tensor,
-          modelScale,
-        });
-        if (feeds === undefined) return;
-        // Run the SAM ONNX model with the feeds returned from modelData()
-        const results = await model.run(feeds);
-        const output = results[model.outputNames[0]];
-        // The predicted mask returned from the ONNX model is an array which is 
-        // rendered as an HTML image using onnxMaskToImage() from maskUtils.tsx.
-        setMaskImg(onnxMaskToImage(output.data, output.dims[2], output.dims[3]));
-      }
-    } catch (e) {
-      console.log(e);
-    }
-  };
-
-  return <Stage />;
-};
-
-export default App;
--- a/demo/src/assets/data/dogs.jpg
+++ b/demo/src/assets/data/dogs.jpg
--- a/demo/src/assets/index.html
+++ b/demo/src/assets/index.html
@@ -1,18 +0,0 @@
-<!DOCTYPE html>
-<html lang="en" dir="ltr" prefix="og: https://ogp.me/ns#" class="w-full h-full">
-  <head>
-    <meta charset="utf-8" />
-    <meta
-      name="viewport"
-      content="width=device-width, initial-scale=1, shrink-to-fit=no"
-    />
-    <title>Segment Anything Demo</title>
-
-    <!--  Meta Tags -->
-    <meta property="og:type" content="website" />
-    <meta property="og:title" content="Segment Anything Demo" />
-  </head>
-  <body class="w-full h-full">
-    <div id="root" class="w-full h-full"></div>
-  </body>
-</html>
--- a/demo/src/assets/scss/App.scss
+++ b/demo/src/assets/scss/App.scss
@@ -1,3 +0,0 @@
-@tailwind base;
-@tailwind components;
-@tailwind utilities;
--- a/demo/src/components/Stage.tsx
+++ b/demo/src/components/Stage.tsx
@@ -1,49 +0,0 @@
-// Copyright (c) Meta Platforms, Inc. and affiliates.
-// All rights reserved.
-
-// This source code is licensed under the license found in the
-// LICENSE file in the root directory of this source tree.
-
-import React, { useContext } from "react";
-import * as _ from "underscore";
-import Tool from "./Tool";
-import { modelInputProps } from "./helpers/Interfaces";
-import AppContext from "./hooks/createContext";
-
-const Stage = () => {
-  const {
-    clicks: [, setClicks],
-    image: [image],
-  } = useContext(AppContext)!;
-
-  const getClick = (x: number, y: number): modelInputProps => {
-    const clickType = 1;
-    return { x, y, clickType };
-  };
-
-  // Get mouse position and scale the (x, y) coordinates back to the natural
-  // scale of the image. Update the state of clicks with setClicks to trigger
-  // the ONNX model to run and generate a new mask via a useEffect in App.tsx
-  const handleMouseMove = _.throttle((e: any) => {
-    let el = e.nativeEvent.target;
-    const rect = el.getBoundingClientRect();
-    let x = e.clientX - rect.left;
-    let y = e.clientY - rect.top;
-    const imageScale = image ? image.width / el.offsetWidth : 1;
-    x *= imageScale;
-    y *= imageScale;
-    const click = getClick(x, y);
-    if (click) setClicks([click]);
-  }, 15);
-
-  const flexCenterClasses = "flex items-center justify-center";
-  return (
-    <div className={`${flexCenterClasses} w-full h-full`}>
-      <div className={`${flexCenterClasses} relative w-[90%] h-[90%]`}>
-        <Tool handleMouseMove={handleMouseMove} />
-      </div>
-    </div>
-  );
-};
-
-export default Stage;
--- a/demo/src/components/Tool.tsx
+++ b/demo/src/components/Tool.tsx
@@ -1,73 +0,0 @@
-// Copyright (c) Meta Platforms, Inc. and affiliates.
-// All rights reserved.
-
-// This source code is licensed under the license found in the
-// LICENSE file in the root directory of this source tree.
-
-import React, { useContext, useEffect, useState } from "react";
-import AppContext from "./hooks/createContext";
-import { ToolProps } from "./helpers/Interfaces";
-import * as _ from "underscore";
-
-const Tool = ({ handleMouseMove }: ToolProps) => {
-  const {
-    image: [image],
-    maskImg: [maskImg, setMaskImg],
-  } = useContext(AppContext)!;
-
-  // Determine if we should shrink or grow the images to match the
-  // width or the height of the page and setup a ResizeObserver to
-  // monitor changes in the size of the page
-  const [shouldFitToWidth, setShouldFitToWidth] = useState(true);
-  const bodyEl = document.body;
-  const fitToPage = () => {
-    if (!image) return;
-    const imageAspectRatio = image.width / image.height;
-    const screenAspectRatio = window.innerWidth / window.innerHeight;
-    setShouldFitToWidth(imageAspectRatio > screenAspectRatio);
-  };
-  const resizeObserver = new ResizeObserver((entries) => {
-    for (const entry of entries) {
-      if (entry.target === bodyEl) {
-        fitToPage();
-      }
-    }
-  });
-  useEffect(() => {
-    fitToPage();
-    resizeObserver.observe(bodyEl);
-    return () => {
-      resizeObserver.unobserve(bodyEl);
-    };
-  }, [image]);
-
-  const imageClasses = "";
-  const maskImageClasses = `absolute opacity-40 pointer-events-none`;
-
-  // Render the image and the predicted mask image on top
-  return (
-    <>
-      {image && (
-        <img
-          onMouseMove={handleMouseMove}
-          onMouseOut={() => _.defer(() => setMaskImg(null))}
-          onTouchStart={handleMouseMove}
-          src={image.src}
-          className={`${
-            shouldFitToWidth ? "w-full" : "h-full"
-          } ${imageClasses}`}
-        ></img>
-      )}
-      {maskImg && (
-        <img
-          src={maskImg.src}
-          className={`${
-            shouldFitToWidth ? "w-full" : "h-full"
-          } ${maskImageClasses}`}
-        ></img>
-      )}
-    </>
-  );
-};
-
-export default Tool;
--- a/demo/src/components/helpers/Interfaces.tsx
+++ b/demo/src/components/helpers/Interfaces.tsx
@@ -1,29 +0,0 @@
-// Copyright (c) Meta Platforms, Inc. and affiliates.
-// All rights reserved.
-
-// This source code is licensed under the license found in the
-// LICENSE file in the root directory of this source tree.
-
-import { Tensor } from "onnxruntime-web";
-
-export interface modelScaleProps {
-  samScale: number;
-  height: number;
-  width: number;
-}
-
-export interface modelInputProps {
-  x: number;
-  y: number;
-  clickType: number;
-}
-
-export interface modeDataProps {
-  clicks?: Array<modelInputProps>;
-  tensor: Tensor;
-  modelScale: modelScaleProps;
-}
-
-export interface ToolProps {
-  handleMouseMove: (e: any) => void;
-}
--- a/demo/src/components/helpers/maskUtils.tsx
+++ b/demo/src/components/helpers/maskUtils.tsx
@@ -1,47 +0,0 @@
-// Copyright (c) Meta Platforms, Inc. and affiliates.
-// All rights reserved.
-
-// This source code is licensed under the license found in the
-// LICENSE file in the root directory of this source tree.
-
-// Convert the onnx model mask prediction to ImageData
-function arrayToImageData(input: any, width: number, height: number) {
-  const [r, g, b, a] = [0, 114, 189, 255]; // the mask's blue color
-  const arr = new Uint8ClampedArray(4 * width * height).fill(0);
-  for (let i = 0; i < input.length; i++) {
-
-    // Threshold the onnx model mask prediction at 0.0
-    // This is equivalent to thresholding the mask using predictor.model.mask_threshold
-    // in python
-    if (input[i] > 0.0) {
-      arr[4 * i + 0] = r;
-      arr[4 * i + 1] = g;
-      arr[4 * i + 2] = b;
-      arr[4 * i + 3] = a;
-    }
-  }
-  return new ImageData(arr, height, width);
-}
-
-// Use a Canvas element to produce an image from ImageData
-function imageDataToImage(imageData: ImageData) {
-  const canvas = imageDataToCanvas(imageData);
-  const image = new Image();
-  image.src = canvas.toDataURL();
-  return image;
-}
-
-// Canvas elements can be created from ImageData
-function imageDataToCanvas(imageData: ImageData) {
-  const canvas = document.createElement("canvas");
-  const ctx = canvas.getContext("2d");
-  canvas.width = imageData.width;
-  canvas.height = imageData.height;
-  ctx?.putImageData(imageData, 0, 0);
-  return canvas;
-}
-
-// Convert the onnx model mask output to an HTMLImageElement
-export function onnxMaskToImage(input: any, width: number, height: number) {
-  return imageDataToImage(arrayToImageData(input, width, height));
-}
--- a/demo/src/components/helpers/onnxModelAPI.tsx
+++ b/demo/src/components/helpers/onnxModelAPI.tsx
@@ -1,71 +0,0 @@
-// Copyright (c) Meta Platforms, Inc. and affiliates.
-// All rights reserved.
-
-// This source code is licensed under the license found in the
-// LICENSE file in the root directory of this source tree.
-
-import { Tensor } from "onnxruntime-web";
-import { modeDataProps } from "./Interfaces";
-
-const modelData = ({ clicks, tensor, modelScale }: modeDataProps) => {
-  const imageEmbedding = tensor;
-  let pointCoords;
-  let pointLabels;
-  let pointCoordsTensor;
-  let pointLabelsTensor;
-
-  // Check there are input click prompts
-  if (clicks) {
-    let n = clicks.length;
-
-    // If there is no box input, a single padding point with 
-    // label -1 and coordinates (0.0, 0.0) should be concatenated
-    // so initialize the array to support (n + 1) points.
-    pointCoords = new Float32Array(2 * (n + 1));
-    pointLabels = new Float32Array(n + 1);
-
-    // Add clicks and scale to what SAM expects
-    for (let i = 0; i < n; i++) {
-      pointCoords[2 * i] = clicks[i].x * modelScale.samScale;
-      pointCoords[2 * i + 1] = clicks[i].y * modelScale.samScale;
-      pointLabels[i] = clicks[i].clickType;
-    }
-
-    // Add in the extra point/label when only clicks and no box
-    // The extra point is at (0, 0) with label -1
-    pointCoords[2 * n] = 0.0;
-    pointCoords[2 * n + 1] = 0.0;
-    pointLabels[n] = -1.0;
-
-    // Create the tensor
-    pointCoordsTensor = new Tensor("float32", pointCoords, [1, n + 1, 2]);
-    pointLabelsTensor = new Tensor("float32", pointLabels, [1, n + 1]);
-  }
-  const imageSizeTensor = new Tensor("float32", [
-    modelScale.height,
-    modelScale.width,
-  ]);
-
-  if (pointCoordsTensor === undefined || pointLabelsTensor === undefined)
-    return;
-
-  // There is no previous mask, so default to an empty tensor
-  const maskInput = new Tensor(
-    "float32",
-    new Float32Array(256 * 256),
-    [1, 1, 256, 256]
-  );
-  // There is no previous mask, so default to 0
-  const hasMaskInput = new Tensor("float32", [0]);
-
-  return {
-    image_embeddings: imageEmbedding,
-    point_coords: pointCoordsTensor,
-    point_labels: pointLabelsTensor,
-    orig_im_size: imageSizeTensor,
-    mask_input: maskInput,
-    has_mask_input: hasMaskInput,
-  };
-};
-
-export { modelData };
--- a/demo/src/components/helpers/scaleHelper.tsx
+++ b/demo/src/components/helpers/scaleHelper.tsx
@@ -1,18 +0,0 @@
-// Copyright (c) Meta Platforms, Inc. and affiliates.
-// All rights reserved.
-
-// This source code is licensed under the license found in the
-// LICENSE file in the root directory of this source tree.
-
-
-// Helper function for handling image scaling needed for SAM
-const handleImageScale = (image: HTMLImageElement) => {
-  // Input images to SAM must be resized so the longest side is 1024
-  const LONG_SIDE_LENGTH = 1024;
-  let w = image.naturalWidth;
-  let h = image.naturalHeight;
-  const samScale = LONG_SIDE_LENGTH / Math.max(h, w);
-  return { height: h, width: w, samScale };
-};
-
-export { handleImageScale };
--- a/demo/src/components/hooks/context.tsx
+++ b/demo/src/components/hooks/context.tsx
@@ -1,31 +0,0 @@
-// Copyright (c) Meta Platforms, Inc. and affiliates.
-// All rights reserved.
-
-// This source code is licensed under the license found in the
-// LICENSE file in the root directory of this source tree.
-
-import React, { useState } from "react";
-import { modelInputProps } from "../helpers/Interfaces";
-import AppContext from "./createContext";
-
-const AppContextProvider = (props: {
-  children: React.ReactElement<any, string | React.JSXElementConstructor<any>>;
-}) => {
-  const [clicks, setClicks] = useState<Array<modelInputProps> | null>(null);
-  const [image, setImage] = useState<HTMLImageElement | null>(null);
-  const [maskImg, setMaskImg] = useState<HTMLImageElement | null>(null);
-
-  return (
-    <AppContext.Provider
-      value={{
-        clicks: [clicks, setClicks],
-        image: [image, setImage],
-        maskImg: [maskImg, setMaskImg],
-      }}
-    >
-      {props.children}
-    </AppContext.Provider>
-  );
-};
-
-export default AppContextProvider;
--- a/demo/src/components/hooks/createContext.tsx
+++ b/demo/src/components/hooks/createContext.tsx
@@ -1,27 +0,0 @@
-// Copyright (c) Meta Platforms, Inc. and affiliates.
-// All rights reserved.
-
-// This source code is licensed under the license found in the
-// LICENSE file in the root directory of this source tree.
-
-import { createContext } from "react";
-import { modelInputProps } from "../helpers/Interfaces";
-
-interface contextProps {
-  clicks: [
-    clicks: modelInputProps[] | null,
-    setClicks: (e: modelInputProps[] | null) => void
-  ];
-  image: [
-    image: HTMLImageElement | null,
-    setImage: (e: HTMLImageElement | null) => void
-  ];
-  maskImg: [
-    maskImg: HTMLImageElement | null,
-    setMaskImg: (e: HTMLImageElement | null) => void
-  ];
-}
-
-const AppContext = createContext<contextProps | null>(null);
-
-export default AppContext;
--- a/demo/src/index.tsx
+++ b/demo/src/index.tsx
@@ -1,17 +0,0 @@
-// Copyright (c) Meta Platforms, Inc. and affiliates.
-// All rights reserved.
-
-// This source code is licensed under the license found in the
-// LICENSE file in the root directory of this source tree.
-
-import * as React from "react";
-import { createRoot } from "react-dom/client";
-import AppContextProvider from "./components/hooks/context";
-import App from "./App";
-const container = document.getElementById("root");
-const root = createRoot(container!);
-root.render(
-  <AppContextProvider>
-    <App/>
-  </AppContextProvider>
-);
--- a/demo/tailwind.config.js
+++ b/demo/tailwind.config.js
@@ -1,12 +0,0 @@
-// Copyright (c) Meta Platforms, Inc. and affiliates.
-// All rights reserved.
-
-// This source code is licensed under the license found in the
-// LICENSE file in the root directory of this source tree.
-
-/** @type {import('tailwindcss').Config} */
-module.exports = {
-  content: ["./src/**/*.{html,js,tsx}"],
-  theme: {},
-  plugins: [],
-};
--- a/demo/tsconfig.json
+++ b/demo/tsconfig.json
@@ -1,24 +0,0 @@
-{
-  "compilerOptions": {
-    "lib": ["dom", "dom.iterable", "esnext"],
-    "allowJs": true,
-    "skipLibCheck": true,
-    "strict": true,
-    "forceConsistentCasingInFileNames": true,
-    "noEmit": false,
-    "esModuleInterop": true,
-    "module": "esnext",
-    "moduleResolution": "node",
-    "resolveJsonModule": true,
-    "isolatedModules": true,
-    "jsx": "react",
-    "incremental": true,
-    "target": "ESNext",
-    "useDefineForClassFields": true,
-    "allowSyntheticDefaultImports": true,
-    "outDir": "./dist/",
-    "sourceMap": true
-  },
-  "include": ["next-env.d.ts", "**/*.ts", "**/*.tsx", "src"],
-  "exclude": ["node_modules"]
-}
--- a/notebooks/automatic_mask_generator_example.ipynb
+++ b/notebooks/automatic_mask_generator_example.ipynb
--- a/onepage.py
+++ b/onepage.py
Author	SHA1	Message	Date
zjt	de39a53ee0	new readme	2023-10-29 13:57:27 +08:00
zjt	bbfe75ff14	new backend	2023-10-29 13:54:25 +08:00
zjt	040aba1654	new server frame	2023-10-22 10:59:55 +08:00
zjt	1e67935333	new server frame	2023-10-22 09:33:09 +08:00
zjt	cb7161d913	new server frame	2023-10-22 09:23:06 +08:00
zjt	dad5200040	new server frame	2023-10-22 09:15:54 +08:00
zjt	93f38c4f85	change large model dir	2023-10-21 22:07:46 +08:00
zjt	64c8342b17	ignore file	2023-10-21 21:56:30 +08:00
zjt	1c0e3e5fe7	segment anything backend	2023-10-21 21:55:27 +08:00
Hanzi Mao	6fdee8f272	Merge pull request #73 from calebrob6/visualization_speed Speeding up the visualization of masks	2023-05-02 08:32:18 -07:00
Caleb Robinson	0cfbf7ca96	Speeding up the visualization of masks	2023-05-02 03:18:33 +00:00