DataScienceClubUVU/ServiceProjectFall2023
Source: Hugging Face. Updated 2023-11-09; indexed 2024-03-04.
Download link:
https://hf-mirror.com/datasets/DataScienceClubUVU/ServiceProjectFall2023
Resource description:
# Deep Learning Service Project (Fall 2023)

# Getting Started
1. Clone the repository with Git LFS disabled or not installed, so that the large weights file is not downloaded during the clone.
**ON WINDOWS (Command Prompt)**
```bash
set GIT_LFS_SKIP_SMUDGE=1
git clone https://huggingface.co/datasets/DataScienceClubUVU/ServiceProjectFall2023
```
**ON LINUX**
```bash
GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/datasets/DataScienceClubUVU/ServiceProjectFall2023
```
2. Download the PyTorch weights file (.pth) from https://huggingface.co/datasets/DataScienceClubUVU/ServiceProjectFall2023/blob/main/mexico_5_column_weights.pth and place it in the root directory of the repository. Because the clone skipped Git LFS, the repository already contains a small placeholder file with the same name. Delete it and replace it with the downloaded file.
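To confirm that the placeholder was actually replaced, a quick check like the following may help. It is a minimal sketch that assumes the placeholder is a standard Git LFS pointer file, which is a short text file beginning with a `version https://git-lfs.github.com/spec/v1` line. The helper name `is_lfs_pointer` is hypothetical and is not part of the repository.

```python
from pathlib import Path

# Git LFS pointer files are small text files that start with this line.
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path):
    """Return True if the file looks like a Git LFS pointer, not real data."""
    with open(path, "rb") as handle:
        return handle.read(len(LFS_POINTER_PREFIX)) == LFS_POINTER_PREFIX


# Assumed location: the repository root, as described in step 2.
weights_path = Path("mexico_5_column_weights.pth")
if weights_path.exists():
    if is_lfs_pointer(weights_path):
        print("Still the LFS placeholder: download the real .pth file.")
    else:
        print("Looks like real weights data.")
```

If the check reports the placeholder, repeat step 2 before running anything that loads the weights.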
3. Install the requirements with pip, using the requirements file that matches your hardware.
**IF YOU DON'T HAVE A GPU**
```bash
pip install -r requirements_cpu.txt
```
**IF YOU HAVE A GPU**
1. Install the requirements without PyTorch:
```bash
pip install -r requirements_no_torch.txt
```
2. Install PyTorch by following the instructions at https://pytorch.org/get-started/locally/ and selecting the build that matches your CUDA version.
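After installing, it can be useful to check which PyTorch build ended up in the environment. The snippet below is a small sketch that uses only `importlib` and the public `torch.cuda.is_available()` call. The function name `describe_torch_setup` is hypothetical.

```python
import importlib.util


def describe_torch_setup():
    """Report whether torch is installed and whether it can see a CUDA GPU."""
    if importlib.util.find_spec("torch") is None:
        return "torch is not installed"
    import torch

    if torch.cuda.is_available():
        return f"torch {torch.__version__} with CUDA available"
    return f"torch {torch.__version__} (CPU only)"


print(describe_torch_setup())
```

A "CPU only" result on a machine with a GPU usually means the CPU build of PyTorch was installed, in which case the selector on the PyTorch site is worth revisiting.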
### How the Model Works
1. Start with an image of a character of text.
2. Convert the image to grayscale using the _**cvtColor**_ function from the _**cv2**_ library, which converts images between color spaces such as RGB/BGR and grayscale.
3. Apply adaptive thresholding, where the threshold value for each pixel = Gaussian-weighted sum of the neighborhood values - a constant value. In other words, it is a weighted sum over the blockSize × blockSize neighborhood of a pixel, minus the constant. In this example, the maximum value is 255, the block size is 155, and the constant is 2.
4. Create a 3x3 matrix of ones to use as an image kernel. An _**image kernel**_ is a small matrix used to apply effects like the ones you might find in Photoshop or GIMP, such as blurring, sharpening, outlining, or embossing. Kernels are also used in machine learning for 'feature extraction', a technique for determining the most important portions of an image.
5. Apply erosion. Like soil erosion, it erodes away the boundaries of the foreground object (always try to keep the foreground in white). It is normally performed on binary images and takes two inputs: the original image and a structuring element, or kernel, which decides the nature of the operation. A pixel in the original image (either 1 or 0) is kept as 1 only if all the pixels under the kernel are 1; otherwise it is eroded (set to 0).
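6. Apply dilation, which accentuates the features of the image. It is the opposite of erosion: a pixel is set to 1 if at least one pixel under the kernel is 1, so foreground regions grow. Whereas erosion is used to reduce the amount of noise in the image, dilation is used to enhance the features that remain.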
7. Traditionally, a line is represented by the equation **_y = mx + b_** (where **_m_** is the slope and **_b_** is the intercept). However, a line can also be represented by the equation **_r = x cos(θ) + y sin(θ)_**, where **_r_** is the distance from the origin to the closest point on the line and **_θ_** (theta) is the angle between the x-axis and the segment joining the origin to that closest point. The pair **_(r, θ)_** is the **_Hough space_** representation of the line.
- Infinitely many straight lines pass through any given point in a two-dimensional space (think of a basic x- and y-axis graph). With a **_Hough Transform_**, you consider several lines through the point, one per sampled value of θ, and build a table of values that records, for each θ, the **_r_** value of the line through the point at that angle.
- Once you have a table of values for every point, you compare the r-values at each θ across the points and select the (r, θ) pair where the points' r-values differ the least, since that line best represents the points in the space.
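The table-of-values idea in step 7 can be illustrated with a small voting sketch in plain Python. It is a simplified version of the usual Hough accumulator, not the project's implementation: each point computes its r for every sampled θ, the r-values are rounded into bins, and the (r, θ) bin shared by the most points is taken as the line. The function name `hough_best_line` and the parameters `n_theta` and `r_step` are hypothetical.

```python
import math
from collections import Counter


def hough_best_line(points, n_theta=180, r_step=1.0):
    """Return (r, theta, votes) for the (r, theta) bin shared by most points."""
    votes = Counter()
    for x, y in points:
        # One table row per sampled theta in [0, pi).
        for i in range(n_theta):
            theta = math.pi * i / n_theta
            r = x * math.cos(theta) + y * math.sin(theta)
            # Points on the same line land in the same (r, theta) bin.
            votes[(round(r / r_step), i)] += 1
    (r_bin, i), count = votes.most_common(1)[0]
    return r_bin * r_step, math.pi * i / n_theta, count


# Twenty points on the horizontal line y = 5.
points = [(x, 5) for x in range(0, 100, 5)]
r, theta, count = hough_best_line(points)
print(r, math.degrees(theta), count)
```

For the horizontal line y = 5, the closest point to the origin is (0, 5), so the winning bin is r = 5 at θ = 90 degrees, with all twenty points voting for it.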
Provided by:
DataScienceClubUVU
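Summary of the original information
Dataset overview
Getting the dataset
- Clone the repository:
  - Windows: `set GIT_LFS_SKIP_SMUDGE=1`, then `git clone https://huggingface.co/datasets/DataScienceClubUVU/ServiceProjectFall2023`
  - Linux: `GIT_LFS_SKIP_SMUDGE=1 git clone https://huggingface.co/datasets/DataScienceClubUVU/ServiceProjectFall2023`
- Download the PyTorch file: download the .pth file and place it in the root directory of the repository.
- Install the dependencies:
  - Without a GPU: `pip install -r requirements_cpu.txt`
  - With a GPU: install the dependencies without Torch using `pip install -r requirements_no_torch.txt`, then install PyTorch by following the instructions on the PyTorch website.
How the model works
- Image processing:
  - Start with an image of a character of text.
  - Use the cvtColor function from the cv2 library to convert the image between RGB/BGR and grayscale.
  - Apply adaptive thresholding, where the threshold value = Gaussian-weighted sum of the neighborhood values - a constant value.
- Image enhancement:
  - Create a 3x3 matrix of ones as the image kernel.
  - Use erosion to erode away the boundaries of the foreground object.
  - Use dilation to accentuate the features of the image.
- Line detection:
  - Use a Hough transform to detect lines.
  - Compare the r and theta values for each point and select the line that best represents the points in the space.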



