Images | Xiaoming's Blog

Image Processing and Classification

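Python has many packages for digital image processing, such as PIL, Pillow, OpenCV, and scikit-image.
PIL and Pillow (the maintained fork of PIL) provide only the most basic image operations, so their functionality is limited.
OpenCV is actually a C++ library that exposes a Python interface.
scikit-image is an image processing package built on SciPy. It treats an image as a NumPy array, much as MATLAB does, so simple operations such as cropping, erasing a region, changing the values of one RGB channel, or stitching images together only require operating on the corresponding array.
The full name of the skimage package is scikit-image, where SciKit stands for "toolkit for SciPy". It extends scipy.ndimage and provides more image processing functionality.
It is written in Python and is developed and maintained by the SciPy community. The package consists of many submodules, each providing different functionality: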

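io: reading, saving, and displaying images or videos;
data: test images and sample data;
color: color space conversion;
filters: image enhancement, edge detection, rank filters, automatic thresholding, and so on;
draw: basic shape drawing on NumPy arrays, such as lines, rectangles, and circles;
transform: geometric and other transforms, such as rotation, scaling, and the Radon transform;
morphology: morphological operations, such as opening, closing, and skeletonization;
exposure: intensity adjustment, such as brightness adjustment and histogram equalization;
feature: feature detection and extraction;
measure: measurement of image properties, such as similarity or contours;
segmentation: image segmentation;
restoration: image restoration;
util: general-purpose utility functions.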
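
Because an image is just a NumPy array, the simple edits mentioned earlier (cropping, erasing a region, changing one RGB channel, stitching) reduce to slicing and assignment. The following is a minimal sketch on a small synthetic RGB array; the array contents and variable names are made up for illustration and stand in for any array returned by an image reader.

```python
import numpy as np

# A synthetic 4x6 RGB image (uint8) used as a stand-in for a real photo.
img = np.zeros((4, 6, 3), dtype=np.uint8)
img[..., 0] = 200  # red channel
img[..., 1] = 100  # green channel
img[..., 2] = 50   # blue channel

# Crop: rows 1-2 and columns 2-4 (slicing returns a view, not a copy).
crop = img[1:3, 2:5]

# Erase: blank out the top-left 2x2 block on a copy.
erased = img.copy()
erased[0:2, 0:2] = 0

# Change one channel: set the red channel to zero on a copy.
no_red = img.copy()
no_red[..., 0] = 0

# Stitch: place the two images side by side along the column axis.
side_by_side = np.concatenate([img, no_red], axis=1)

print(crop.shape, side_by_side.shape)
```

Copies are taken before the in-place edits so that the original array stays available for comparison.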

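(1) Reading an image file

skimage.io.imread(fname, as_gray=False, plugin=None, flatten=None, **plugin_args)
fname is a string giving the file name or URL.
A color image is returned as a three-dimensional ndarray whose third axis holds the R, G, and B channels in that order (a PNG with transparency has a fourth, alpha channel). With as_gray=True the result is a two-dimensional grayscale array.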

(2) Displaying an image

skimage.io.imshow(arr, plugin=None, **plugin_args)
arr is an array or a string: either the image data to display or the name of an image file.

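(3) Showing pending images

skimage.io.show()
Displays the images that are pending. It is usually paired with imshow(): for example, when imshow() is called for several images inside a loop, the images are queued inside the loop, and a call to show() after the loop displays them.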

(4) Saving an image

skimage.io.imsave(fname, arr, plugin=None, check_contrast=True, **plugin_args)
fname is a string giving the target file name. arr is the array holding the image data.
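
To make the read and save calls concrete, here is a small round-trip sketch. It assumes scikit-image is installed with its default I/O plugin; the gradient image, the temporary directory, and the file name gradient.png are invented for the example.

```python
import os
import tempfile

import numpy as np
from skimage import io

# Build a synthetic 64x256 RGB image: a left-to-right gradient in the red
# channel, the reverse gradient in green, and a constant blue channel.
gradient = np.tile(np.arange(256, dtype=np.uint8), (64, 1))
rgb = np.stack([gradient, gradient[:, ::-1], np.full_like(gradient, 128)], axis=-1)

with tempfile.TemporaryDirectory() as tmp_dir:
    out_path = os.path.join(tmp_dir, 'gradient.png')  # hypothetical file name
    # check_contrast=False only silences the low-contrast warning.
    io.imsave(out_path, rgb, check_contrast=False)
    loaded = io.imread(out_path)

print(loaded.shape, loaded.dtype)
```

PNG is lossless, so the array read back should match the array that was written.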

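(5) Rescaling an image

skimage.transform.rescale(image, scale, order=1, mode='reflect', cval=0, clip=True, preserve_range=False, multichannel=None, anti_aliasing=True, anti_aliasing_sigma=None)
image is the array holding the input image data. scale is a float, or a tuple of floats, giving the scale factor. For a color image, pass multichannel=True (channel_axis=-1 from scikit-image 0.19 on) so that the channel axis is not scaled.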

(6) Resizing an image

skimage.transform.resize(image, output_shape, order=1, mode='reflect', cval=0, clip=True, preserve_range=False, anti_aliasing=True, anti_aliasing_sigma=None)
image is the array holding the input image data. output_shape is a tuple or array giving the size the image should be resized to.
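
The difference between the two calls is that rescale takes a factor while resize takes a target shape. A minimal sketch on a synthetic grayscale array (a 2-D input sidesteps the channel-axis question entirely); the array and variable names are illustrative only.

```python
import numpy as np
from skimage import transform

# Synthetic 80x100 grayscale image with a horizontal intensity ramp.
gray = np.tile(np.arange(100, dtype=np.uint8), (80, 1))

# rescale: every axis is multiplied by the factor.
half = transform.rescale(gray, 0.5)

# resize: the output shape is given explicitly, so the aspect ratio may change.
thumb = transform.resize(gray, (50, 50))

print(half.shape, thumb.shape, thumb.dtype)
```

Note that both functions return floating-point arrays; with the default preserve_range=False, uint8 input is mapped to the range [0, 1].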

(7) Rotating an image

skimage.transform.rotate(image, angle, resize=False, center=None, order=1, mode='constant', cval=0, clip=True, preserve_range=False)
image is the array holding the input image data. angle is a float giving the rotation angle in degrees, counter-clockwise.
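
The resize flag of rotate decides whether the output keeps the input shape (cropping the corners) or grows to contain the whole rotated image. A short sketch on a synthetic image, with names chosen only for this example:

```python
import numpy as np
from skimage import transform

# Synthetic 100x100 image: a white horizontal bar on a black background.
img = np.zeros((100, 100), dtype=float)
img[40:60, 20:80] = 1.0

# Default resize=False: the output has the same shape, so corners may be cut off.
same_size = transform.rotate(img, 45)

# resize=True: the output canvas is enlarged to hold the full rotated image.
expanded = transform.rotate(img, 45, resize=True)

print(same_size.shape, expanded.shape)
```

Areas that fall outside the original image are filled with cval (0 by default) because mode defaults to 'constant'.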

(8) Converting an RGB image to grayscale

skimage.color.rgb2gray(rgb)
rgb is image data in RGB format. The function returns the grayscale image data.
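
Conceptually, the conversion is a weighted sum of the three channels. The sketch below is not the library's implementation; it is a NumPy-only illustration that assumes the luminance weights 0.2125, 0.7154, and 0.0721 listed in the scikit-image documentation for rgb2gray, and the helper name to_gray is hypothetical.

```python
import numpy as np

# Assumed luminance weights for R, G, B (as documented for skimage.color.rgb2gray).
WEIGHTS = np.array([0.2125, 0.7154, 0.0721])

def to_gray(rgb):
    """Return the weighted channel sum of a float RGB image with values in [0, 1]."""
    rgb = np.asarray(rgb, dtype=float)
    # Use only the first three channels so an RGBA input does not break the sum.
    return rgb[..., :3] @ WEIGHTS

# A 2x2 test image: white, red, green, and blue pixels.
pixels = np.array([[[1.0, 1.0, 1.0], [1.0, 0.0, 0.0]],
                   [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]])
gray = to_gray(pixels)
print(gray)
```

The weights sum to 1, so a white pixel stays at full intensity, and green contributes far more to perceived brightness than blue.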

#Example 15-1: read an image file, inspect its shape, and display the image
import numpy as np
from skimage import io,transform,exposure
from skimage.transform import resize
import matplotlib.pyplot as plt
import os
path='D:/my_python/ch15/data/'
if not os.path.exists(path):
  os.makedirs(path)
img0 = io.imread(os.path.join(path,'lena.png'))
print('Shape of the color image img0:',img0.shape)
#%%
#display the image with skimage's io.imshow()
io.imshow(img0)
io.show()
#%%
#display the image with matplotlib.pyplot's imshow()
plt.figure(figsize=(4.5,4.5))
plt.imshow(img0)
plt.show()
#%%
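#Example 15-2: shrink the image
#shrink the image with transform.rescale()
#width and height are scaled by the same factor
#channel_axis=-1 keeps the channel axis unscaled (scikit-image>=0.19; older releases use multichannel=True)
print('Shape of img0 after shrinking:',transform.rescale(img0, 0.5, channel_axis=-1).shape)
io.imshow(transform.rescale(img0, 0.5, channel_axis=-1))
io.show()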
#%%
#shrink the image with resize(), giving the target height and width in pixels (the 4 keeps the RGBA channels)
print('Shape of img0 after shrinking:',resize(img0, (50,50,4)).shape)
io.imshow(resize(img0, (50,50,4)))
io.show()
#%%
#Example 15-4: rotate the image 45 degrees counter-clockwise
io.imshow(transform.rotate(img0, 45))
io.show()
#%%
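#Example 15-5: save an image
from skimage import img_as_ubyte
#resize() returns floats in [0,1]; convert to uint8 before writing the PNG
io.imsave(os.path.join(path,'lena_resized.png'), img_as_ubyte(resize(img0, (50,50,4))))
img_resized = io.imread(os.path.join(path,'lena_resized.png'))
io.imshow(img_resized)
io.show()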
#%%
#Example 15-6: display single channels of the image
io.imshow(img0[:,:,0])#show the R channel
io.show()
#%%
io.imshow(img0[:,:,1])#show the G channel
io.show()
#%%
io.imshow(img0[:,:,2])#show the B channel
io.show()
#%%
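#Example 15-11: convert the original image to grayscale
from skimage.color import rgb2gray
#rgb2gray expects 3 channels, so drop the alpha channel of an RGBA PNG first
img_gray = rgb2gray(img0[...,:3])#convert the color image to grayscale
print('Shape of the grayscale image img_gray:',img_gray.shape)
io.imshow(img_gray)
io.show()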
#%%
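#display the grayscale image with matplotlib.pyplot
plt.rc('font', size=14)#set the font size used in figures
plt.rcParams['font.sans-serif'] = 'SimHei'#use the SimHei font (only needed to render Chinese labels)
plt.figure(figsize=(4,4))
plt.imshow(img_gray,cmap=plt.cm.gray)#show the grayscale image
plt.title('Grayscale image')
plt.show()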
#%%
#Example 15-12: plot the grayscale histogram
plt.figure(figsize=(6,4))
plt.hist(img_gray.flatten(), bins=100, density=True)
plt.title('Grayscale histogram')
plt.show()
#%%
#Example 15-12: plot the histograms of the original color channels
img0 = io.imread(os.path.join(path,'lena.png'))
plt.figure(figsize=(6,4))
#the hold argument was removed from matplotlib; successive calls already draw on the same axes
plt.hist(img0[:,:,0].flatten(), bins=100, density=True,facecolor='r')
plt.hist(img0[:,:,1].flatten(), bins=100, density=True,facecolor='g')
plt.hist(img0[:,:,2].flatten(), bins=100, density=True,facecolor='b')
plt.title('Color channel histograms')
plt.legend(['red','green','blue'])
plt.show()
#%%

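Feature Clustering on a Single Image

Clustering the features of an image reveals which features are similar and which differ, which helps with image analysis and recognition. Three approaches are shown:

Clustering with the rows of the grayscale image as samples
Clustering with each gray value (pixel) as a sample
Clustering the original color image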
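
The three approaches differ only in how the image array is reshaped into a sample matrix before it is passed to KMeans. The sketch below illustrates this on a synthetic two-tone image, assuming scikit-learn is available; the image, the noise level, and all variable names are invented for the example.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Synthetic 20x30 grayscale image: dark left half, bright right half, slight noise.
gray = np.empty((20, 30))
gray[:, :15] = 0.2
gray[:, 15:] = 0.8
gray += rng.normal(0.0, 0.01, gray.shape)

# A matching RGBA image (gray repeated in R, G, B plus an opaque alpha channel).
rgba = np.dstack([gray, gray, gray, np.ones_like(gray)])

# The three sample layouts described above.
rows_as_samples = gray                    # (20, 30): one sample per image row
pixels_as_samples = gray.reshape(-1, 1)   # (600, 1): one sample per gray value
colors_as_samples = rgba.reshape(-1, 4)   # (600, 4): one sample per RGBA pixel

# Cluster the gray values and replace every pixel with its cluster center.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(pixels_as_samples)
quantized = kmeans.cluster_centers_[kmeans.labels_].reshape(gray.shape)

print(rows_as_samples.shape, pixels_as_samples.shape, colors_as_samples.shape)
print(np.sort(kmeans.cluster_centers_.ravel()))
```

Indexing cluster_centers_ with labels_ builds a new array, so the original pixel data is left untouched and can be reused, for example for an elbow-method plot.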

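#Example 15-13: cluster with the rows of the grayscale image as samples (each row holds 256 gray values)
from sklearn.cluster import KMeans
#visualize the clustering result
K=10
X=img_gray.copy()#work on a copy so that img_gray itself is not overwritten
kmeans = KMeans(n_clusters = K).fit(X)#build and fit the model
centers=kmeans.cluster_centers_
print('Shape of the cluster centers:',centers.shape)
#print(centers[0,:])
labels=kmeans.labels_
#print(labels)
for i in range(K):
    #replace every sample in a cluster with its cluster center, so rows in the same cluster look identical
    X[labels==i]=centers[i,:]
plt.figure(figsize=(4,4))
plt.imshow(X,cmap=plt.cm.gray)
plt.title('K='+str(K))
plt.show()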
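#cluster with different values of K and compare the results
p = plt.figure(figsize=(8,8))
for K,figNum in zip([10,20,30,50],[1,2,3,4]):
    #channel_axis=-1 keeps the channel axis unscaled (scikit-image>=0.19; older releases use multichannel=True)
    img_rescaled = transform.rescale(io.imread(os.path.join(path,'lena.png')),0.5,channel_axis=-1)
    img_gray = rgb2gray(img_rescaled[...,:3])#drop the alpha channel before converting
    X=img_gray
    #print('Shape of X:',X.shape)
    kmeans = KMeans(n_clusters = K).fit(X)#build and fit the model
    centers=kmeans.cluster_centers_
    labels=kmeans.labels_
    for i in range(K):
        #replace every sample in a cluster with its cluster center, so rows in the same cluster look identical
        X[labels==i]=centers[i,:]
    #draw subplot figNum
    ax = p.add_subplot(2,2,figNum)
    plt.imshow(X,cmap=plt.cm.gray)
    plt.title('K='+str(K))
plt.show()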



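#%%
#Example 15-14: cluster with every gray value as a sample, extract the gray values of each cluster, and visualize the result
K=4
#channel_axis=-1 keeps the channel axis unscaled (scikit-image>=0.19; older releases use multichannel=True)
img_rescaled = transform.rescale(io.imread(os.path.join(path,'lena.png')),0.5,channel_axis=-1)
img_gray = rgb2gray(img_rescaled[...,:3])#drop the alpha channel before converting
X=img_gray
X1=X.reshape(-1,1)#reshape the 2-D grayscale image into a single-feature dataset
#print('Shape of X:',X.shape)
#print('Shape of X1:',X1.shape)
kmeans = KMeans(n_clusters = K).fit(X1)#build and fit the model
centers=kmeans.cluster_centers_
print('Centers of the',K,'clusters:\n',centers)
labels=kmeans.labels_
#print(labels)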
#%%
#draw the grayscale image of each cluster
p = plt.figure(figsize=(8,8))
for figNum in [1,2,3,4]:
    X2=X1.copy()#make an explicit copy of X1
    #fill samples outside this cluster with gray value 1.0 (white)
    X2[labels!=figNum-1]=1.0
    #print('X2:',X2.shape)
    #print('X1:',X1.shape)
    #draw subplot figNum
    ax = p.add_subplot(2,2,figNum)
    plt.imshow(X2.reshape(X.shape),cmap=plt.cm.gray)
    plt.title('Clustering result: cluster '+str(figNum-1))
plt.show()
#%%
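#compare the grayscale clustering results for different values of K
p = plt.figure(figsize=(8,8))
for K,figNum in zip([2,4,6,8],[1,2,3,4]):
    #channel_axis=-1 keeps the channel axis unscaled (scikit-image>=0.19; older releases use multichannel=True)
    img_rescaled = transform.rescale(io.imread(os.path.join(path,'lena.png')),0.5,channel_axis=-1)
    img_gray = rgb2gray(img_rescaled[...,:3])#drop the alpha channel before converting
    X=img_gray
    X1=X.reshape(-1,1)
    #print('Shape of X:',X.shape)
    kmeans = KMeans(n_clusters = K).fit(X1)#build and fit the model
    centers=kmeans.cluster_centers_
    #print('Cluster centers:\n',centers)
    labels=kmeans.labels_
    #print(labels)
    #replace every sample with its cluster center, so each cluster shows as a single gray value;
    #indexing builds a new array, which leaves X1 intact for the elbow-method cell that follows
    X_quantized=centers[labels]
    #draw subplot figNum
    ax = p.add_subplot(2,2,figNum)
    plt.imshow(X_quantized.reshape(X.shape),cmap=plt.cm.gray)
    plt.title('K='+str(K))
plt.show()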
#%%
#compute the mean distortion for K from 1 to 11 and use the elbow method to look for a good number of clusters K
#import the KMeans class
from sklearn.cluster import KMeans
#import cdist from scipy to compute distances
from scipy.spatial.distance import cdist
K=range(1,12)
meandistortions=[]
for k in K:
    kmeans=KMeans(n_clusters=k)
    kmeans.fit(X1)
    #mean distance from each sample to its nearest cluster center
    meandistortions.append(sum(np.min(
        cdist(X1,kmeans.cluster_centers_,
                'euclidean'),axis=1))/X1.shape[0])
#visualize
plt.figure(figsize=(6,4))
plt.grid(True)
plt.plot(K,meandistortions,'kx-')
plt.xlabel('k')
plt.ylabel('Mean distortion')
plt.title('Choosing the best K with the elbow method')
plt.show()
#%%
#Example 15-15: cluster the original color image
#channel_axis=-1 keeps the channel axis unscaled (scikit-image>=0.19; older releases use multichannel=True)
img_rescaled = transform.rescale(io.imread(os.path.join(path,'lena.png')),0.5,channel_axis=-1)
print('Shape of img_rescaled:',img_rescaled.shape)
plt.figure(figsize=(4,4))
plt.imshow(img_rescaled)
plt.title('Original color image after shrinking')
plt.show()
#%%
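#a PNG image with an alpha channel has shape (rows, cols, 4); reshape it to (rows*cols, 4), i.e. 4 features per pixel
#after clustering, extract the colors of each cluster and visualize them separately
K=4
#channel_axis=-1 keeps the channel axis unscaled (scikit-image>=0.19; older releases use multichannel=True)
img_rescaled = transform.rescale(io.imread(os.path.join(path,'lena.png')),0.5,channel_axis=-1)
X=img_rescaled
X1=X.reshape(-1,4)#reshape the color values to (rows*cols, 4), i.e. 4 features per pixel
#print(X1[0:3,:])
kmeans = KMeans(n_clusters = K).fit(X1)#build and fit the model
centers=kmeans.cluster_centers_
print('Centers of the',K,'clusters:\n',centers)
labels=kmeans.labels_
#print(labels)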
#%%
#draw the image of each cluster
p = plt.figure(figsize=(8,8))
for figNum in [1,2,3,4]:
    X2=X1.copy()#make an explicit copy of X1
    #hide samples outside this cluster by painting them opaque white
    X2[labels!=figNum-1]=[1,1,1,1]
    #draw subplot figNum
    ax = p.add_subplot(2,2,figNum)
    plt.imshow(X2.reshape(X.shape))
    plt.title('Clustering result: cluster '+str(figNum-1))
plt.show()
#%%
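#cluster the original color image with different values of K and visualize the results
p = plt.figure(figsize=(8,8))
for K,figNum in zip([2,4,6,8],[1,2,3,4]):
    #channel_axis=-1 keeps the channel axis unscaled (scikit-image>=0.19; older releases use multichannel=True)
    img_rescaled =transform.rescale(io.imread(os.path.join(path,'lena.png')),0.5,channel_axis=-1)
    X=img_rescaled
    #print('Shape of img_rescaled:',img_rescaled.shape)
    X1=X.reshape(-1,4)
    #print('Shape of X:',X.shape)
    kmeans = KMeans(n_clusters = K).fit(X1)#build and fit the model
    centers=kmeans.cluster_centers_
    print('Cluster centers for K=',K,':\n',centers)
    #print(centers.shape)
    labels=kmeans.labels_
    #print(labels)
    #replace every sample with its cluster center, so each cluster shows as a single color;
    #indexing builds a new array, which leaves X1 intact for the elbow-method cell that follows
    X_quantized=centers[labels]
    #draw subplot figNum
    ax = p.add_subplot(2,2,figNum)
    plt.imshow(X_quantized.reshape(X.shape))
    plt.title('K='+str(K))
plt.show()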
#%%
#compute the mean distortion for K from 1 to 11 and use the elbow method to look for a good number of clusters K
#import the KMeans class
from sklearn.cluster import KMeans
#import cdist from scipy to compute distances
from scipy.spatial.distance import cdist
K=range(1,12)
meandistortions=[]
for k in K:
    kmeans=KMeans(n_clusters=k)
    kmeans.fit(X1)
    #mean distance from each sample to its nearest cluster center
    meandistortions.append(sum(np.min(
        cdist(X1,kmeans.cluster_centers_,
                'euclidean'),axis=1))/X1.shape[0])

#visualize
plt.figure(figsize=(6,4))
plt.grid(True)
plt.plot(K,meandistortions,'kx-')
plt.xlabel('k')
plt.ylabel('Mean distortion')
plt.title('Choosing the best K with the elbow method')
plt.show()
#%%

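Image Classification

Handwritten Chinese character recognition

To classify images of Chinese characters, first convert each image to grayscale; each image then serves as one sample.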
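
The idea of "one image, one sample" means flattening each grayscale image into a row of a feature matrix. The sketch below shows that step and a linear SVM on synthetic data, assuming scikit-learn is available. The 8x8 bar images, the two class names, and the helper make_image are invented stand-ins for the handwritten character images, not the dataset used in this post.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

def make_image(kind):
    """Return a noisy 8x8 grayscale image with a horizontal or vertical bar (hypothetical data)."""
    img = rng.normal(0.0, 0.05, (8, 8))
    if kind == 'horizontal':
        img[3:5, :] += 1.0
    else:
        img[:, 3:5] += 1.0
    return img

# 100 labeled images, alternating between the two classes.
labels = np.array(['horizontal', 'vertical'] * 50)
images = [make_image(kind) for kind in labels]

# Flatten every image into one row: the sample matrix has shape (n_images, n_pixels).
X = np.array([img.ravel() for img in images])

X_train, X_test, y_train, y_test = train_test_split(
    X, labels, train_size=0.7, random_state=42, stratify=labels)

clf = SVC(kernel='linear').fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
print(X.shape, accuracy)
```

Flattening requires all images to have the same size; images of different sizes would need to be resized first.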

#Example 15-16: read images, store them as a text file, read the text file back, and classify with an SVM
import numpy as np
from skimage import io,data,transform,exposure
import matplotlib.pyplot as plt
from skimage.color import rgb2gray
import pandas as pd
import os
path='D:/my_python/ch15/data/compressed/'
if not os.path.exists(path):
  os.makedirs(path)
source_category_list = os.listdir(path)
print('Subdirectories of the compressed directory:\n',source_category_list)
#%%
print('Reading images in batch, please wait...')
for mydir in source_category_list:
    #build the path of the directory (i.e. the category) that holds the source files
    source_category_path = path+ mydir + "/"
    coll=io.ImageCollection(source_category_path+'*.jpg')
    print(source_category_path,'loaded')
    #pick 10 random image indices for this category
    idx=np.random.randint(0,high=len(coll), size=10)
    p = plt.figure(figsize=(10,2))
    for fignum in range(10):
        ax1 = p.add_subplot(1,10,fignum+1)
        plt.imshow(coll[idx[fignum]])#use the random index, not just the first 10 images
    p.tight_layout()#adjust the spacing so that subplots do not overlap
print('Batch read finished!')
print('10 random images of each handwritten Chinese character:\n')
#%%
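#build the image dataset: one row (one sample) per image
from skimage.color import rgb2gray
print('Reading images in batch, please wait...')
X=[]
y=[]
for mydir in source_category_list:
    #build the path of the directory (i.e. the category) that holds the source files
    source_category_path = path+ mydir + "/"
    #print(source_category_path)
    #list all files in this directory (category)
    source_file_list = os.listdir(source_category_path)
    #print(source_file_list)
    for file_name in source_file_list:
        full_name=source_category_path+file_name
        img0=io.imread(full_name)
        img_gray = rgb2gray(img0)#convert the color image to grayscale
        X.append(img_gray.ravel())#flatten the image into one row
        y.append(mydir)#the directory name is the class label
#stack the rows into an array of shape (n_samples, n_pixels); this assumes all images have the same size
X=np.array(X)
y=np.array(y)
print('Batch read finished!')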
#%%
print('Shape of the handwritten Chinese character dataset:',X.shape)
print('Shape of the handwritten Chinese character targets:',y.shape)
#%%
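idx=np.random.randint(0,high=len(y), size=50)
print('50 random samples from the image dataset:\n')
p = plt.figure(figsize=(10,6))
for fignum in range(len(idx)):
    ax1 = p.add_subplot(5,10,fignum+1)
    #np.int was removed from NumPy; use the built-in int (the reshape assumes square images)
    plt.imshow(X[idx[fignum],:].reshape(int(len(X[idx[fignum],:])**0.5),-1))
p.tight_layout()#adjust the spacing so that subplots do not overlap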
#%%
path1='D:/my_python/ch15/data/'
if not os.path.exists(path1):
  os.makedirs(path1)
df_images_data=pd.DataFrame(X)
df_images_data.to_csv(path1+'cn_writtings_data.csv',sep = ',',index = False) #save as a CSV text file
df_images_target=pd.DataFrame(y)
df_images_target.to_csv(path1+'cn_writtings_target.csv',sep = ',',index = False) #save as a CSV text file
print('Image dataset converted to DataFrames and saved!')
#%%
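#read the saved image dataset files
#to_csv writes UTF-8 by default, so read the files back as UTF-8 rather than GBK
X=pd.read_csv(path1+'cn_writtings_data.csv',
      sep = ',',encoding = 'utf-8').values#read the CSV text file
y=pd.read_csv(path1+'cn_writtings_target.csv',
      sep = ',',encoding = 'utf-8').values#read the CSV text file
print('Shape of the feature set read from file:',X.shape)
print('Shape of the target set read from file:',y.shape)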
#%%
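idx=np.random.randint(0,high=len(y), size=50)
print('50 random samples from the image data read from file:\n')
p = plt.figure(figsize=(10,6))
for fignum in range(len(idx)):
    ax1 = p.add_subplot(5,10,fignum+1)
    #np.int was removed from NumPy; use the built-in int (the reshape assumes square images)
    plt.imshow(X[idx[fignum],:].reshape(int(len(X[idx[fignum],:])**0.5),-1))
p.tight_layout()#adjust the spacing so that subplots do not overlap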
#%%
from sklearn import svm
from sklearn.model_selection import train_test_split,cross_val_score
from sklearn import metrics
from sklearn.metrics import confusion_matrix,classification_report
X_train,X_test, y_train,y_test = train_test_split(
    X,y,train_size = 0.7,random_state = 42)
#train a support vector machine
#available kernels: 'linear', 'poly', 'rbf', 'sigmoid'
#gamma is ignored by the linear kernel, so it is not set here
clf_svm = svm.SVC(kernel='linear')#set the model parameters
clf_svm.fit(X_train, y_train.ravel())#train
y_svm_pred=clf_svm.predict(X_test)
print('Confusion matrix of SVM predictions against the true test labels:\n',
      confusion_matrix(y_test.ravel(), y_svm_pred.ravel()))#print the confusion matrix
#%%
print('Classification report for the SVM predictions:\n',
    classification_report(y_test.ravel(),y_svm_pred.ravel()))
#%%
#cross-validation
print('Cross-validation scores:',cross_val_score(clf_svm, X, y.ravel(), cv=5))