


Int. J. Inf. Secur. (2010) 9:19–32 DOI 10.1007/s10207-009-0093-2

REGULAR CONTRIBUTION

Content-based image authentication: current status, issues, and challenges
Shui-Hua Han · Chao-Hsien Chu

Published online: 8 October 2009 © Springer-Verlag 2009

Abstract In today's global digital environment, the Internet is readily accessible anytime and from anywhere, and so is digital image manipulation software; thus, digital data can easily be tampered with without notice. Under these circumstances, integrity verification has become an important issue in the digital world. The aim of this paper is to present an in-depth review and analysis of methods for detecting image tampering. We introduce the notion of content-based image authentication and the features required to design an effective authentication scheme. We review major algorithms and frequently used security mechanisms found in the open literature. We also analyze and discuss the performance trade-offs and related security issues among existing technologies.

Keywords Image hash · Content-based authentication · Performance trade-off · Security · Image tampering

This work was partly supported by the National Science Foundation of China (No. 70971112) and the Program for New Century Excellent Talents in Fujian Province University (No. X04139).
S.-H. Han (corresponding author), School of Management, Xiamen University, 361005 Xiamen, People's Republic of China, e-mail: hansh@xmu.edu.cn
C.-H. Chu, College of Information Sciences and Technology, The Pennsylvania State University, University Park, PA 16802, USA

1 Introduction

Recent advancements in communication infrastructure, signal processing, and digital storage technologies have enabled pervasive digital media distribution. Digital distribution introduces a flexible and cost-effective business model that is beneficial to multimedia commerce transactions. The digital nature of the information also allows individuals to manipulate, duplicate, or access media beyond the conditions agreed upon for a given transaction. Under these circumstances, integrity verification has become an important issue in the digital world.

In real-world applications, hash functions are typically used with digital signatures to authenticate the message being sent so that recipients can verify its source. A key concern with using conventional hashing algorithms such as MD5 and SHA-1 in image authentication is that they are extremely sensitive to changes in the message; i.e., a one-bit change in the input alters the output dramatically. While this level of sensitivity usually meets the need of authenticating text messages, authenticating multimedia is not as straightforward. Multimedia data such as digital images often need to go through various manipulations such as compression, enhancement, cropping, and scaling. Therefore, bit-by-bit verification is not a suitable way to authenticate multimedia data.

Although traditional data authentication technology for message integrity verification is mature, image content authentication is still in its early development stage, and many fundamental questions remain open [19]. For instance, given the number of different algorithms developed over the past years, it is difficult to say which approach is most suitable for ensuring integrity for images and, more generally, for multimedia documents. There is a need to synthesize the literature in order to understand the nature of the problems, identify potential research issues, standardize the new research area, and evaluate the relative performance of different approaches. The aim of this paper is to examine the status and issues of content-based image authentication methods and to assess their strengths, weaknesses and relative


Fig. 1 Architecture of a content-based authentication system. (a) Generation of a content-based hash. (b) Verification of a content-based hash


performance in terms of robustness and security. We focus our attention on image hashes. This paper is organized as follows. The notion of content-based image authentication and its framework are briefly introduced in Sect. 2, followed by a discussion of the performance requirements for image hashing in Sect. 3. Section 4 provides a concise review of existing methods for image authentication. Security-related issues and performance trade-offs are discussed in detail in Sects. 5 and 6. Finally, challenges and future research directions are presented.

Table 1 Content-preserving and content-changing manipulations

Content-preserving manipulations:
- Transmission errors
- Noise
- Compression and quantization
- Resolution reduction
- Scaling
- Rotation
- Cropping
- γ distortion
- Color conversions
- Contrast adjustment
- Changes of brightness, hue and saturation

Content-changing manipulations:
- Removing image objects
- Moving image elements or changing their positions
- Adding new objects
- Changes of image characteristics: color, textures, structure, impression, etc.
- Changes of the image background: day time or location (e.g., forest, ocean)
- Changes of light conditions: shadow manipulations, etc.

2 Content-based image authentication

The main concept behind image authentication is to extract the image characteristics (or contents) that matter to human perception and use them during the authentication process. A generic architecture of a typical content-based authentication system is shown in Fig. 1. To generate an image hash, a secret key is often used to extract certain features from the data. These features are further processed to form the hash. The hash is transmitted along with the media, either by appending it to or embedding it in the primary media data. At the receiver side, the authenticator uses the same key to generate the hash values, which are compared with the ones transmitted along with the data to verify its authenticity.

Typically, some applications may require compression to satisfy resource constraints on bandwidth or storage space. Some applications may also need to enhance the image quality, crop the image, change its size, or perform some other operations. Content-preserving manipulations only change the pixel values, which results in different levels of visual distortion in the image, but the contents of the image, which carry the same visual meaning to the observer, are still preserved. On the other hand, malicious (or content-changing) manipulations change the image into a new one, which carries a different visual meaning to the observer. One typical example of malicious modification is replacing some


parts of the image with different contents. A rough classification of content-preserving and content-changing manipulations is given in Table 1. Authentication signatures are expected to survive acceptable content-preserving manipulations and to reject malicious manipulations.
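The generate-and-verify flow described in this section can be sketched in a few lines of Python. This is only a toy illustration under stated assumptions, not any of the surveyed schemes: the feature (mean intensity of key-selected square regions), the quantization step, and the names extract_features, generate_hash, and verify are all hypothetical choices made for the example.

```python
import random


def extract_features(image, key, n_regions=8, size=4):
    """Toy feature extractor: mean intensity of key-selected square regions."""
    rng = random.Random(key)  # the secret key seeds the region selection
    rows, cols = len(image), len(image[0])
    features = []
    for _ in range(n_regions):
        r0 = rng.randrange(rows - size + 1)
        c0 = rng.randrange(cols - size + 1)
        block = [image[i][j]
                 for i in range(r0, r0 + size)
                 for j in range(c0, c0 + size)]
        features.append(sum(block) / len(block))
    return features


def generate_hash(image, key, step=16):
    """Coarsely quantize the features so that small pixel changes keep the hash."""
    return [int(f // step) for f in extract_features(image, key)]


def verify(image, key, received_hash, max_mismatch=0):
    """Receiver side: recompute the hash with the same key and compare."""
    local_hash = generate_hash(image, key)
    mismatches = sum(a != b for a, b in zip(local_hash, received_hash))
    return mismatches <= max_mismatch


original = [[100] * 16 for _ in range(16)]
h = generate_hash(original, key="secret")
brightened = [[101] * 16 for _ in range(16)]   # slight, content-preserving change
replaced = [[200] * 16 for _ in range(16)]     # entirely different content
print(verify(brightened, "secret", h), verify(replaced, "secret", h))
```

In this toy setting the slightly brightened image is accepted and the replaced image is rejected; a real scheme needs features that remain stable near quantization boundaries and a key-dependent step that an attacker cannot simply reproduce.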

3 Performance requirements for image hashing

Let I denote a set of images (e.g., all natural images of a particular size) with finite cardinality, and let K denote the space of secret keys. Assuming that H(·) is an image hash function, the hash value of an image I under a key K is h = H_K(I). For image authentication purposes, the hash function should meet the design requirements of an authentication scheme. Some


Fig. 2 Image hash based on the statistical vector from wavelet

1. Obtain an L-level wavelet decomposition of the image. Let I_DC represent the DC sub-band.
2. Divide I_DC into overlapping and pseudorandomly selected rectangular regions {R_i}, i = 1, 2, …, P.
3. For each region, obtain a statistic by taking an inner product with a smoothly varying Gaussian random field.
4. Collect the statistics from all regions into a length-P statistical vector. Scalar uniform quantization of this vector gives the hash vector.

common indicators for measuring the performance of an image hash include:

(1) Robustness. A hash function H(·) is said to satisfy the robustness property if, when the same key is used on a set of perceptually similar images, it produces similar hash values:

    $$\Pr\big(\lVert H_K(I) - H_K(I_{ident}) \rVert < \tau\big) \ge 1 - \theta_1, \quad \text{for a given } \theta_1 \text{ and } \forall\, I, I_{ident}$$

(2) Fragility. A hash function H(·) is said to satisfy the fragility property if, when the same key is used on a set of perceptually dissimilar images, it produces dissimilar hash values:

    $$\Pr\big(\lVert H_K(I) - H_K(I_{diff}) \rVert > \tau\big) \ge 1 - \theta_2, \quad \text{for a given } \theta_2 \text{ and } \forall\, I, I_{diff}$$

(3) One-way property. Given an output y for which there exists an x such that H_K(x) = y, it is computationally infeasible to compute any preimage x′ such that H_K(x′) = y.

(4) Collision resistance. Given any preimage x, it is computationally infeasible to find a second preimage x′ ≠ x such that H_K(x) = H_K(x′).

(5) Key secrecy. Because the image hash is generated using a secret key, the hash value should not be easily forged or estimated without knowledge of the key.
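The robustness and fragility conditions amount to a threshold test on a distance between two hash values, and the probabilities can be estimated empirically over a set of image pairs. The sketch below assumes the normalized Hamming distance as the norm, which the requirements above leave unspecified; the function names and the threshold value are hypothetical.

```python
def normalized_hamming(h1, h2):
    """Fraction of positions in which two equal-length hashes differ."""
    if len(h1) != len(h2):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(h1, h2)) / len(h1)


def is_authentic(h_ref, h_test, tau=0.2):
    """Accept the test image when its hash lies within distance tau of the reference."""
    return normalized_hamming(h_ref, h_test) < tau


def empirical_rate(hash_pairs, tau, expect_similar):
    """Estimate Pr(d < tau) for similar pairs, or Pr(d > tau) for dissimilar pairs."""
    hits = 0
    for a, b in hash_pairs:
        d = normalized_hamming(a, b)
        hits += (d < tau) if expect_similar else (d > tau)
    return hits / len(hash_pairs)


print(normalized_hamming([1, 0, 1, 1], [1, 1, 1, 1]))
```

An estimate from empirical_rate with expect_similar=True plays the role of 1 − θ1, and with expect_similar=False the role of 1 − θ2, for the chosen τ.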

4 State of the art review

Image authentication in general, and image hashing in particular, has received considerable attention from academia and practitioners over the last few years. Many algorithms and their variations have been proposed. These techniques can be roughly classified into four categories: methods based on image statistics, methods based on relations between transformation coefficients, methods that preserve a coarse image representation, and methods that use low-level image feature extraction.

4.1 Statistics-based approaches

This group of methods is based upon selected invariant statistics such as the mean, variance, and higher moments of the image histogram. Schneider and Chang [20] suggested the use of selected intensity statistics, such as the mean and variance of image blocks, to create an image hash. They conjectured that such statistics have good robustness properties under small perturbations to the image. A serious drawback of this method is that it is easy to modify an image without altering its histogram. This jeopardizes the security properties of any scheme that relies on intensity statistics. Venkatesan et al. [24] developed an image hash based on the statistical vector extracted from various sub-bands in a wavelet decomposition of the image (shown in Fig. 2). Although statistics of wavelet coefficients have been found to be far more robust than intensity statistics, they do not necessarily capture content changes well, particularly those that are maliciously generated. As a result, a new method based upon moment normalization [1] was employed to achieve better geometric invariance (shown in Fig. 3). Similar methods based on the Radon transformation [22] and Zernike moments [7] were also employed. The major limitation of these methods is their inability to resist cropping attacks, because the loss of content changes the moments.

4.2 Relation-based approaches

This category of approaches is based upon the invariant relation between a pair of transformation coefficients. Lin and Chang [9] proposed a block discrete cosine transform (DCT)-based digital signature method for image authentication (shown in Fig. 4). The scheme exploits the invariance of the relationship between selected DCT coefficients at the same position in separate blocks of the image to differentiate JPEG compression from malicious operations. The relationship is preserved even when the DCT coefficients are requantized. As a result, the authentication scheme can distinguish malicious operations from JPEG compression regardless of the compression ratio and the number of compression iterations. Such a feature is robust to JPEG compression but is very fragile to other insignificant modifications. Lu and Liao [10] proposed a "structural digital signature" for image authentication (shown in Fig. 5). They observed that in sub-band wavelet decomposition, a parent node and its child nodes are uncorrelated, but they are statistically dependent. In


Fig. 3 Moment normalization-based method

1. Calculate the geometric moments m_{p,q} of a grayscale image f(x, y):

   $$m_{p,q} = \iint_{\Gamma} x^{p} y^{q} f(x, y)\, dx\, dy \qquad (1)$$

   where Γ is the support of the image.

2. Modify the image to obtain new invariant values, called central moments, as below:

   $$\mu_{p,q} = \iint_{\Gamma} (x - \bar{x})^{p} (y - \bar{y})^{q} f(x, y)\, dx\, dy \qquad (2)$$

3. Use the central moments to create a normalizing environment for geometrically manipulated images. The normalized central moments become:

   $$\eta_{p,q} = \frac{\mu_{p,q}}{(\mu_{0,0})^{\gamma}}, \qquad \gamma = \frac{p + q + 2}{2} \qquad (3)$$

4. Apply (3) to normalize the flipping direction, translation, scale, and orientation of the image.
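Equations (1) to (3) translate directly into discrete sums over the pixel grid. The sketch below is an illustration only: it replaces the integrals by sums, and it assumes the usual centroid definition (x̄ = m_{1,0}/m_{0,0}, ȳ = m_{0,1}/m_{0,0}), which the figure does not spell out. The function names are hypothetical.

```python
def raw_moment(img, p, q):
    """Discrete version of Eq. (1); x is the column index, y the row index."""
    return sum((x ** p) * (y ** q) * img[y][x]
               for y in range(len(img)) for x in range(len(img[0])))


def central_moment(img, p, q):
    """Discrete version of Eq. (2), centered on the intensity centroid (assumed)."""
    m00 = raw_moment(img, 0, 0)
    x_bar = raw_moment(img, 1, 0) / m00
    y_bar = raw_moment(img, 0, 1) / m00
    return sum(((x - x_bar) ** p) * ((y - y_bar) ** q) * img[y][x]
               for y in range(len(img)) for x in range(len(img[0])))


def normalized_moment(img, p, q):
    """Eq. (3): divide by mu_00 raised to gamma = (p + q + 2) / 2."""
    gamma = (p + q + 2) / 2
    return central_moment(img, p, q) / central_moment(img, 0, 0) ** gamma


def square_at(row, col, size=6):
    """A 2x2 unit-intensity square placed at (row, col) on a blank canvas."""
    img = [[0] * size for _ in range(size)]
    for dy in (0, 1):
        for dx in (0, 1):
            img[row + dy][col + dx] = 1
    return img


# The same object at two positions yields the same normalized moment.
print(normalized_moment(square_at(1, 1), 2, 0), normalized_moment(square_at(3, 2), 2, 0))
```

The example only demonstrates translation invariance; on a discrete grid, scale invariance of the normalized moments holds only approximately.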

Fig. 4 DCT-based digital signature method

1. Divide the original image into 8 × 8 blocks.
2. Form block pairs using a predetermined secret mapping function.
3. For each block pair (p, q):
   a. Select a set of n DCT coefficients, B_p;
   b. Generate the binary signature φ_p of the block pair such that:

      $$\varphi_p(v) = \begin{cases} 1, & F_p(v) - F_q(v) \ge 0, \\ 0, & F_p(v) - F_q(v) < 0, \end{cases} \qquad (4)$$

      where v ∈ B_p and F_p(v) is the value of coefficient v in block p.
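A minimal sketch of the steps in Fig. 4 is given below. It is not the implementation of [9]: the secret mapping function is approximated by a key-seeded shuffle, the coefficient set B_p is fixed to three low-frequency positions, and requantization is modeled by uniform rounding with one step size for all coefficients. All function names and parameters are hypothetical.

```python
import math
import random


def dct_coeff(block, u, v):
    """2-D DCT-II coefficient (u, v) of an N x N block, orthonormal scaling."""
    n = len(block)
    cu = math.sqrt((1 if u == 0 else 2) / n)
    cv = math.sqrt((1 if v == 0 else 2) / n)
    total = 0.0
    for x in range(n):
        for y in range(n):
            total += (block[x][y]
                      * math.cos((2 * x + 1) * u * math.pi / (2 * n))
                      * math.cos((2 * y + 1) * v * math.pi / (2 * n)))
    return cu * cv * total


def split_blocks(img, n=8):
    """Step 1: non-overlapping n x n blocks in row-major order."""
    return [[row[c:c + n] for row in img[r:r + n]]
            for r in range(0, len(img) - n + 1, n)
            for c in range(0, len(img[0]) - n + 1, n)]


def pair_blocks(count, key):
    """Step 2: key-seeded shuffle standing in for the secret mapping function."""
    order = list(range(count))
    random.Random(key).shuffle(order)
    return [(order[k], order[k + 1]) for k in range(0, count - 1, 2)]


def quantize(value, step):
    """Uniform requantization; step=None leaves the coefficient untouched."""
    return value if step is None else round(value / step) * step


def signature(img, key, positions=((0, 0), (0, 1), (1, 0)), q_step=None):
    """Step 3: one bit per selected coefficient, following Eq. (4)."""
    blocks = split_blocks(img)
    bits = []
    for p, q in pair_blocks(len(blocks), key):
        for u, v in positions:
            fp = quantize(dct_coeff(blocks[p], u, v), q_step)
            fq = quantize(dct_coeff(blocks[q], u, v), q_step)
            bits.append(1 if fp - fq >= 0 else 0)
    return bits


image = [[(3 * i * i + 7 * j + i * j) % 256 for j in range(16)] for i in range(16)]
print(signature(image, "k"), signature(image, "k", q_step=16))
```

Because rounding with a common step is monotone, a bit that equals 1 before requantization still equals 1 afterwards; a bit that equals 0 may flip to 1 only when the two quantized coefficients become equal.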

Fig. 5 Structural digital signature

1. Compute the DWT of the image, where the size of the lowest frequency band is fixed to be 16 × 16.
2. Select those parent-child pairs whose magnitude difference is larger than a pre-determined threshold. We consider pairs of this kind significant. In fact, the threshold is determined from the desired false positive and false negative probabilities.
3. Each selected pair <p, c> is classified as one of four types, defined as follows:
   Type I: p > 0 and |p| > |c|;
   Type II: p < 0 and |p| > |c|;
   Type III: c > 0 and |p| < |c|;
   Type IV: c < 0 and |p| < |c|.
4. Initially, SDS[i, j] = V for all i, j. The SDS array is recorded as SDS[i, j] = I or II or III or IV according to step 3, where [i, j] is a child's coordinate of a significant pair in the wavelet domain.
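Steps 2 to 4 of Fig. 5 reduce to a small classification rule. The sketch below assumes that the wavelet coefficients are already available as (parent, child) value pairs keyed by the child's coordinate; the function names and the dictionary layout are hypothetical.

```python
def classify_pair(p, c, threshold):
    """Return the type 'I'..'IV' of a significant pair, or 'V' if not significant."""
    if abs(abs(p) - abs(c)) <= threshold:
        return "V"                      # magnitude difference too small
    if abs(p) > abs(c):
        return "I" if p > 0 else "II"
    return "III" if c > 0 else "IV"


def build_sds(pairs, shape, threshold):
    """Fill the SDS array; `pairs` maps a child's coordinate (i, j) to (p, c)."""
    rows, cols = shape
    sds = [["V"] * cols for _ in range(rows)]   # step 4: initialize every entry to V
    for (i, j), (p, c) in pairs.items():
        sds[i][j] = classify_pair(p, c, threshold)
    return sds


example = {(0, 0): (5.0, 1.0), (0, 1): (-5.0, 1.0), (1, 0): (1.0, 5.0), (1, 1): (3.0, 2.0)}
print(build_sds(example, (2, 2), threshold=2.0))
```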

particular, they observed that the difference of the magnitudes of wavelet coefficients at consecutive scales (i.e., a parent and its four child nodes) remains largely preserved under several content-preserving manipulations. Identifying such parent–child pairs and subsequently encoding them forms their digital signature. Their scheme, however, is very sensitive to global (e.g., small rotations and bending) as well as local geometric distortions that do not cause perceptually significant changes to the image. Recently, Xiang et al. [28] proposed a histogram-based image hashing scheme that exploits the invariance of the relationship in the number of pixels among groups of two different bins of histogram


Fig. 6 Visual hash generation process

Fig. 7 Iterative geometric methods

1. Find the DWT of X up to level L. Let X_A be the resulting DC sub-band.
2. Perform the following threshold operation on X_A to produce the binary map M:

   $$M(i, j) = \begin{cases} 1, & \text{if } X_A(i, j) \ge T, \\ 0, & \text{otherwise.} \end{cases} \qquad (5)$$

   T is chosen such that W(M) ≈ q, where 0 ≤ q ≤ 1 is a parameter of the algorithm.
3. (Geometric region growing) Let M_1 = M, ctr = 1.
   3.1. (Order-statistics filtering on M_1) M_2 := S_{[m,n];p}(M_1), where m, n and p are algorithm parameters.
   3.2. Apply 2-dimensional linear shift-invariant filtering on M_3 via filter f, where M_3(i, j) = A·M_2(i, j); f and A are algorithm parameters. Let the output be M_4.
   3.3. Apply a threshold operation on M_4, similar to the one explained in step 2. Let M_5 be the output, such that W(M_5) ≈ q.
   3.4. If ctr ≥ C, terminate the iteration and go to step 4. Otherwise, find D(M_5, M_1); if it is less than ε, terminate the iteration and go to step 4; if not, set M_1 = M_5, ctr = ctr + 1, and go to step 3.1.
4. H(X) = M_5.

shape. Such a method performs satisfactorily under various geometric deformations.

4.3 Preserving coarse image representations

This group of approaches is based on coarse image features. Fridrich and Goljan [4] proposed a hash function for image data based upon the visual hash function (VHF) of the image. They observed that the magnitude of a low-frequency DCT coefficient of an image cannot be changed easily without causing visible changes to the image. To make the procedure dependent on a key, the DCT basis vectors are replaced with low-frequency, DC-free random smooth patterns generated from a secret key, with projections onto the patterns playing the role of DCT coefficients. More specifically, using a secret key K, N random matrices with entries uniformly distributed in the interval [0, 1] are generated. Then, the random matrices are low-pass filtered and made DC-free by subtracting their mean, which yields random but smooth patterns P^(i) (1 ≤ i ≤ N). Then, for a particular image block B, N hash bits are generated, as shown in Fig. 6.

Mihçak and Venkatesan [14] developed another image hashing algorithm using an iterative filtering technique (shown in Fig. 7) that minimizes the presence of "geometrically weak components" and enhances the "geometrically strong components" by means of region growing. A region that has isolated significant components (geometrically weak) is a good candidate to be erased by modifications, whereas a region that has massive clusters of significant components (geometrically strong) would probably remain. Kozat et al. [8] proposed an image hashing scheme that retains the strongest components of the Singular Value Decomposition (SVD) of image blocks (shown in Fig. 8). They apply SVD to pseudo-randomly chosen semi-global regions of images and employ the singular vectors to extract robust features in crucial steps of the hashing algorithm (instead of using the singular values of the whole


Fig. 8 SVD-based image hashing

1. Let the n × n input image be I ∈ R^{n×n}.
2. From I, pseudo-randomly form p possibly overlapping rectangles (each of size m × m): A_i ∈ R^{m×m}, 1 ≤ i ≤ p.
3. Generate a feature vector g_i from each rectangle A_i via the transformation g_i = T_1(A_i).
4. Construct a secondary image J by using a pseudorandom (PR) combination of the intermediate feature vectors {g_1, …, g_p}.
5. From J, pseudo-randomly form r possibly overlapping rectangles (each of size d × d): B_i ∈ R^{d×d}, 1 ≤ i ≤ r.
6. Generate a final feature vector f_i from each rectangle B_i via the transformation f_i = T_2(B_i).
7. Combine {f_1, f_2, …, f_r} to form the final hash vector.

Fig. 9 Fourier–Mellin transformation (FMT)-based image hashing

(1) Preprocessing: First, apply a low-pass filter to the input image and down-sample it, then perform histogram equalization on the down-sampled image to get I(x, y). Take the Fourier transform of the preprocessed image to obtain I(f_x, f_y). Using the polar coordinate representation f_x = ρ cos θ, f_y = ρ sin θ, ρ ∈ [0, 1], the Fourier transform output is converted into polar coordinates to arrive at

    $$|I'(\rho, \theta)| = |\sigma|^{-2}\, |I(\rho\sigma^{-1}, \theta - \alpha)|.$$

(2) Feature generation: Obtain I′(ρ, θ) as in step (4) of Fig. 8 and sum I′(ρ, θ) along the θ-axis at equidistant points in the range [0, 2π), i.e., for θ ∈ {π/K, 3π/K, …, (2K − 1)π/K}, to obtain the jth hash value as in step (5) of Fig. 8. K = 360 is used in its implementation.

    $$h_j = \sum_{i=0}^{K-1} \beta_{\rho_j, i}\, I'\!\left(\rho_j, \frac{(2i + 1)\pi}{K}\right) \qquad (6)$$

    where {β_{ρ_j, i}} are key-dependent pseudo-random numbers that are normally distributed with mean m and variance σ².

(3) Post-processing: Quantize the resulting statistics vector and apply Gray coding to obtain the binary hash sequence.

image). Its robustness against geometric attacks has motivated other solutions in this direction. Swaminathan et al. [23] proposed an image hashing scheme that retains the rotation-invariant coefficients of the Fourier–Mellin transformation (shown in Fig. 9). Monga and Mihçak [16] introduced another dimension reduction technique, called non-negative matrix factorization (NMF), into their hash algorithm (shown in Fig. 10). NMF hashing possesses excellent robustness under a large class of perceptually insignificant attacks while significantly reducing misclassification of perceptually distinct images. Recently, Lv and Wang [11] proposed a new hash algorithm based on the Fast Johnson–Lindenstrauss transform (FJLT) (shown in Fig. 11). These coarse-feature-based approaches possess excellent robustness to most malicious attacks. We discuss them in more detail in Sect. 4.5.

4.4 Low-level image representations

This category of approaches is based on low-level image features such as edges or feature points [2] (shown in Fig. 12), as they capture the content of the image. However, such methods are sensitive to some perceptually insignificant modifications such as scaling, high quantization, and resolution reduction, so their robustness is limited. Monga and Evans [15] proposed a perceptual image hashing framework using feature points, based on the tendency of the visual system to respond to extremely robust image features such as corner-like stimuli and points of high curvature (shown in Fig. 13). They first extract significant image features using an "end-stopped" wavelet-based feature detection algorithm. Further, an iterative procedure is used to lock onto a set of image feature points with excellent invariance


Fig. 10 NMF-based image hashing

1. Given an image, pseudorandomly select p subimages A_i ∈ R^{m×m}, 1 ≤ i ≤ p. Note that there is no restriction on the subimages to be disjoint.
2. Obtain a rank-r_1 non-negative matrix factorization (NMF) of each subimage (r_1 << m):

   $$A_i \approx W_i F_i^{T}$$

   where W_i and F_i are both m × r_1. This results in 2p NMF matrices of size m × r_1 each.
3. Pseudorandomly arrange these matrices to obtain a secondary image J of size m × 2pr_1.
4. Reapply NMF to obtain a rank-r_2 representation of J, r_2 << min(m, 2pr_1):

   $$J \approx WH$$

   where W is m × r_2 and H is r_2 × 2pr_1.
5. The concatenation of the columns of W and the rows of H gives the hash vector.
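The two-stage factorization in Fig. 10 can be sketched in pure Python as below. This is an illustrative simplification, not the implementation of [16]: NMF is computed with the standard multiplicative update rules and a fixed number of iterations, the factors are initialized from the key-seeded generator, and the defaults (p = 4, m = 4, r_1 = r_2 = 1) are chosen only to keep the example small. The function names are hypothetical.

```python
import random


def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(a):
    return [list(r) for r in zip(*a)]


def nmf(a, rank, rng, iters=200, eps=1e-9):
    """Multiplicative updates: a (m x n) ~= w (m x rank) times h (rank x n)."""
    m, n = len(a), len(a[0])
    w = [[rng.random() + 0.1 for _ in range(rank)] for _ in range(m)]
    h = [[rng.random() + 0.1 for _ in range(n)] for _ in range(rank)]
    for _ in range(iters):
        wt = transpose(w)
        num, den = matmul(wt, a), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(n)]
             for i in range(rank)]
        ht = transpose(h)
        num, den = matmul(a, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(rank)]
             for i in range(m)]
    return w, h


def nmf_hash(img, key, p=4, m=4, r1=1, r2=1):
    rng = random.Random(key)
    rows, cols = len(img), len(img[0])
    factors = []
    for _ in range(p):                                   # step 1: random subimages
        r0, c0 = rng.randrange(rows - m + 1), rng.randrange(cols - m + 1)
        sub = [row[c0:c0 + m] for row in img[r0:r0 + m]]
        w, h = nmf(sub, r1, rng)                         # step 2: A_i ~= W_i F_i^T
        factors += [w, transpose(h)]                     # W_i and F_i, both m x r1
    rng.shuffle(factors)                                 # step 3: pseudorandom arrangement
    j = [sum((f[i] for f in factors), []) for i in range(m)]   # m x 2*p*r1
    w, h = nmf(j, r2, rng)                               # step 4: J ~= W H
    # step 5: columns of W followed by rows of H
    return [w[i][k] for k in range(r2) for i in range(m)] + [v for row in h for v in row]


image = [[8 * i + j + 1 for j in range(8)] for i in range(8)]
print(len(nmf_hash(image, "secret")))
```

NMF factors are unique only up to scaling and permutation, so a practical scheme has to fix the initialization and iteration count (as this sketch does through the key) and quantize the output before comparing hashes.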

Fig. 11 FJLT-based image hashing

1. Random sampling: Given an image, pseudorandomly select p subimages S_i ∈ R^{m×m}, 1 ≤ i ≤ p. Each S_i is turned into an m² × 1 vector by concatenating the columns of the corresponding subimage. Then we construct the original feature as:

   $$Feature = \{S_1, S_2, \ldots, S_p\}, \quad \text{with size } m^{2} \times p$$

2. Dimension reduction by FJLT: Get three real-valued matrices P, H, and D by the construction Φ = FJLT(P, m², ε), where H is deterministic but P and D depend pseudorandomly on the secret key. Then we can get the intermediate hash (IH) as:

   $$IH = \Phi(Feature) = P \cdot H \cdot D \times Feature$$

3. Random weight incorporation: Introduce the pseudorandom weight vectors {W_i}, i = 1, …, p, and get the final secure hash as

   $$Hash = \{\langle IH_1, W_1\rangle, \langle IH_2, W_2\rangle, \ldots, \langle IH_p, W_p\rangle\}$$

   where ⟨IH_i, W_i⟩ is the inner product of the vectors IH_i and W_i.

Fig. 12 Low-level image features based method

1. Extract features using the Canny edge detector E: C_I = E(I).
2. Transform this image into a binary contour characteristic, the binary edge pattern: EP_CI = f(C_I).
3. Make a VLC for data reduction to produce the feature code: VLC(EP_CI).
4. Use the signature generation function sign to calculate the hash over the VLC code and sign the hash value, instead of using the VLC code directly: sig_I = sign(Hash(VLC(EP_CI)))_PrivateKey.

properties under perceptually insignificant perturbations. Owing to the intrinsic sensitivity of the feature-point detector to content tampering, such a method can achieve good robustness.

4.5 Robustness performance assessment

The robustness of image hashes has been widely discussed in the literature. However, an objective evaluation of the available methods is still lacking. We use the miscellaneous volume of the USC-SIPI image database [21] to evaluate the relative performance of hashing algorithms. The six typical classes of manipulations and attacks considered are: additive Gaussian noise, median filtering, JPEG compression, rotation, cropping, and scaling. The details of the corresponding parameters are listed in Table 2. We have also tested the performance under several content-changing attacks, including object insertion and removal, alteration of the position of image elements, and alteration of a significant image characteristic such as texture or structure. In all cases, the generalized Hausdorff distance between the features of the original and attacked images was used to measure fragility to content-changing attacks. The relative robustness of the mentioned methods against these manipulations is shown in Table 3.


Fig. 13 Perceptual image hash

1. Get the parameters MaxIter, a distance threshold, and P; set count = 1.
2. Use an "end-stopped" wavelet-based feature detector to extract the length-P vector of feature points f.
3. Quantize f in a probabilistic sense to obtain a binary string b_f^1.
4. Perform order-statistics filtering. Let I_os = OS(I; p, q, r), which is the 2-D order-statistics filtering of the input I. For a 2-D input X, Y = OS(X; p, q, r), where for all i, j, Y(i, j) is equal to the rth element of the sorted set of X(i_0, j_0), where i_0 ∈ {i − p, i − p + 1, …, i + p} and j_0 ∈ {j − q, j − q + 1, …, j + q}.
5. Perform low-pass linear shift-invariant filtering on I_os to obtain I_lp.
6. Repeat steps (2) and (3) with I_lp to obtain b_f^2.
7. If count = MaxIter, go to step 8; else if D_H(b_f^1, b_f^2) is below the distance threshold, go to step 8; else set I = I_lp and go to step 2.
8. Set H(I) = b_f^2.
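The order-statistics filter OS(X; p, q, r) of step 4 can be written directly from its definition. The sketch below makes one assumption that the figure does not specify: at the image border the window is clipped to the valid pixels and r is clamped to the window size. The function name is hypothetical.

```python
def order_statistic_filter(x, p, q, r):
    """Y(i, j) = r-th smallest value (1-based) in the (2p+1) x (2q+1) window around (i, j)."""
    rows, cols = len(x), len(x[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            window = sorted(x[a][b]
                            for a in range(max(0, i - p), min(rows, i + p + 1))
                            for b in range(max(0, j - q), min(cols, j + q + 1)))
            k = min(r, len(window)) - 1     # clamp r where the window is clipped (assumption)
            out[i][j] = window[k]
    return out


# An isolated bright pixel (a "geometrically weak" component) is removed by the median.
spike = [[0] * 5 for _ in range(5)]
spike[2][2] = 1
print(order_statistic_filter(spike, 1, 1, 5)[2][2])
```

With p = q = 1, choosing r = 5 gives the median filter, r = 1 the minimum filter, and r = 9 the maximum filter.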

Table 2 Parameters and PSNR for manipulated images

Manipulation      | Parameters and PSNR
Gaussian noise    | Variance = (0.05–0.4, 8)
Median filter     | order = (2–11, 1)
JPEG compression  | QF = {1–99%}
Rotation          | 2°, 3°, 5°, 10°, 15°, 20°, 30°, 40°
Cropping (%)      | 2, 3, 5, 10, 20, 25, 30, 40
Scaling (%)       | 5, 10, 15, 20, 25, 50, 100, 200

As can be seen from Table 3, the FJLT-based method is the most robust, followed by the NMF-based, SVD-based, and FMT-based methods and the method based on invariance of the histogram. It is worth noting that these five methods retain excellent robustness under heavy geometric manipulation (rotation > 10°, cropping > 25%, scaling > 100%). The FJLT-based and NMF-based methods tolerate the strongest JPEG compression (QF = 1%), while the "end-stopped" wavelet, Gaussian randlets, moment, and DC-of-wavelet methods achieve a certain robustness under slight geometric manipulation (rotation < 5°, cropping < 10%, scaling < 100%). Other methods, such as those based on edges and feature points, pairs of DCT coefficients, and the histogram, are sensitive to most content-preserving manipulations. As for fragility to content-changing attacks, the methods based on the histogram, DWT, invariance of the histogram, and SVD score low; they cannot easily distinguish content-preserving manipulations from malicious ones. The other methods exhibit relatively high fragility to most content-changing attacks.

5 Security issues of image hashing

5.1 Attacks on image hashing

Generally speaking, image damage is often assumed to be caused by natural processes such as noise or by misuse, and is often neglected. With today's advanced information technology, and with manipulation software readily accessible on the Internet, images may be altered intentionally or manipulated intelligently without notice; thus, the status of images must be constantly monitored and identified. The potential attacks on image hashes can be roughly classified as follows:

(1) Quantization attack [3,6]. The targeted schemes first divide the image into quantized units, such as image blocks or pixels, and then embed a hash value in each authentication unit. Because the hash value embedded in each authentication unit is independent of the contents of the others, if two different embedding units have similar hash values, an attacker can exchange these two units to modify the image contents without causing authentication failure.

(2) Cryptographic attack [5]. In a cryptographic attack, an attacker collects as many images as possible that embed the same watermark using the same secret key, and then tries to recover the key or other secret information used by the image authentication algorithm.

(3) Black-box attack [27]. Also known as an oracle attack. A "black box" takes an image to be authenticated as input and outputs the tamper detection result, i.e., a sign of "success" or "failure". In this attack, an intentionally modified image is input into


Table 3 Comparison of image hashing algorithms in terms of robustness

The columns from Gaussian noise to Scaling are content-preserving manipulations; the Content changing column gives the true positive rate.

Image hash method            | Gaussian noise | Median filter | JPEG compression | Rotation | Cropping | Scaling | Content changing | Reference
Statistics based
  Histogram                  | 0.2            | 3             | NA               | NA       | 10%      | NA      | <93%             | [20]
  DWT                        | 0.2            | 3             | 5                | 2°       | 10%      | 10%     | <90%             | [24]
  Moment                     | 0.2            | 5             | 10               | 2°       | 5%       | 200%    | 100%             | [1]
Relation based
  Pair of DCT                | NA             | NA            | 10               | NA       | NA       | NA      | 100%             | [9]
  Pair of wavelet            | NA             | NA            | 10               | NA       | 10%      | 100%    | 100%             | [10]
  Invariance of the histogram| 0.1            | 5             | 10               | 20°      | 25%      | 100%    | <92%             | [28]
Coarse image representations
  DC of DCT                  | 0.2            | 3             | 15               | NA       | 2%       | NA      | 100%             | [4]
  DC of wavelet              | 0.2            | 3             | 10               | 2°       | 5%       | 50%     | 100%             | [14]
  Gaussian randlets          | 0.2            | 3             | 10               | 5°       | 5%       | 80%     | 100%             | [12]
  FMT-based                  | 0.2            | 5             | 10               | 10°      | 30%      | 150%    | 100%             | [23]
  SVD-based                  | 0.4            | 5             | 5                | 15°      | 25%      | 100%    | <95%             | [8]
  NMF-based                  | 0.3            | 5             | 1                | 30°      | 35%      | 100%    | 100%             | [16]
  FJLT-based                 | 0.4            | 5             | 1                | 40°      | 40%      | 200%    | 100%             | [11]
Low-level features
  Edges                      | NA             | NA            | NA               | NA       | NA       | NA      | 100%             | [2]
  End-stopped wavelet        | 0.2            | 5             | 10               | 3°       | 10%      | 60%     | <98%             | [15]

NA not applicable

the black box and the result is observed. If the output is "success," the modification will not be detected. After trying many successful combinations of modifications, the attacker can deduce the secret key or other secret information used during watermark embedding, and can then modify the image intentionally without causing authentication failure.

(4) Feature extraction attack [27]. Such an attack exploits the fact that the extracted features do not capture the image contents completely, so that allowed and malicious modifications can hardly be distinguished. For example, when image edges are used as the feature, an attacker can alter the colors while keeping the edges intact. When the intensity histogram of the image is used as the feature, the attacker can try to create a new image with the same histogram but different contents.
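The histogram example of the feature extraction attack is easy to reproduce. The sketch below is a deliberately crude illustration: it rearranges the pixels by sorting them, which keeps the intensity histogram exactly but destroys the spatial content. A real attacker would rearrange pixels toward a meaningful target image instead; the function names are hypothetical.

```python
from collections import Counter


def histogram(img):
    """Intensity histogram: how many pixels take each value."""
    return Counter(v for row in img for v in row)


def rearrange_by_intensity(img):
    """Forge a new image that reuses exactly the same pixel values in sorted order."""
    flat = sorted(v for row in img for v in row)
    width = len(img[0])
    return [flat[k:k + width] for k in range(0, len(flat), width)]


checkerboard = [[255 * ((i + j) % 2) for j in range(4)] for i in range(4)]
forged = rearrange_by_intensity(checkerboard)
print(histogram(forged) == histogram(checkerboard), forged == checkerboard)
```

Any hash that depends only on the global histogram cannot tell the checkerboard from the forged half-black, half-white image.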

5.2 Security mechanisms

The common objective of attacks on image hashes is to trick the authentication system into missing intentional changes to images; specifically, to create a new image with different visual contents while still preserving the feature values. In order to prevent such counterfeiting attacks, a security mechanism should be combined into the hash generation process. In general, the security mechanisms used to protect image hashes against attacks can be roughly classified into four major types (see Fig. 14): direct projection without a key, projection first and then randomization with a key, image randomization with a key and then projection, and random transformation on the projection with a key.

(1) Method A: The invariant features are directly extracted by a common image processing operation, and then the hash value is created. Typically, image edges [2], as well as relations between image feature pairs, such as pairs of DCT coefficients [9], capture the basic content of the image. Meanwhile, the principal component directions and values of an image are known to be resistant to various image manipulations [1]. However, since these popular features are well known, they can easily be attacked by a counterfeit attack [3].

(2) Method B: The image is first transformed into robust features, and those features are then further randomized by a secret key [11,15,16,24,28]. The security protection of this method is very limited, because the feature values are not protected by a key; thus, attackers can create a new image with totally different content that outputs the same set of feature values [6,18].


(3) Method C: The image blocks are first randomized and then projected into a feature space [3,8,11,14,16,23]. However, since wavelet and Fourier–Mellin transformations both use an orthogonal transformation, an attacker can use statistical learning to find the clustering space feature of the hash function and create a totally different new image to launch the attack [17,26]. In addition, such a transformation may lose local information and therefore cannot be used for tamper localization.

(4) Method D: This method first randomly generates a set of basis functions and then projects them onto an image. Without knowing the basis functions, it is hard for an adversary to predict what the projections will be. Mihçak and Venkatesan [14] used Gaussian and Curvelet wavelets as basis functions and then performed a Randlet Transformation. The randomized nature of this transformation makes it especially difficult for an attacker to discover the hash value. Although the Randlet Transform has a good security property, its robustness is still limited.

Fig. 14 Security mechanisms on image hashing. a Direct projection without a key. b Projection then randomization with a key. c Randomization with a key then projection. d Transformation projection with a key

5.3 Measurement of security

Security has always been a major design and evaluation criterion for image hashing algorithms. In most image hashing schemes, the output of the feature extraction stage consists of two components: a deterministic part and a random part. The deterministic part is contributed by the image contents; the random part is contributed by the pseudorandom numbers generated using the secret key. The degree of uncertainty of the resulting random variables can be captured in terms of the differential entropy [23] and the unicity distance [13]. Here, we take the differential entropy as a metric to characterize the amount of randomness in hash values. The higher the differential entropy of the hash value, the higher the randomness and the larger the number of exhaustive searches required to forge the hash value. The differential entropy of a continuous random variable X is denoted by H(X) and given by:

$$H(X) = \int_{\Omega} f(x) \log_2 \frac{1}{f(x)} \, dx \qquad (7)$$

where f(x) is the probability density function of X, and Ω is the range of support of f(x). Schemes that do not have any random component in the feature extraction stage have a differential entropy of −∞ by definition. We compare the security of image hashing schemes in terms of the differential entropy. The results are shown in Table 4. From Table 4, we observe that the schemes employing security mechanism (A) have a differential entropy of −∞ and thus the lowest randomization; the schemes employing security mechanism (B) or security mechanism (C) have a differential entropy between roughly 5 and 100, so they have fair randomization; the scheme employing security mechanism (D) has the greatest differential entropy, 1974, so it achieves high randomization.
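As an illustration of the differential entropy in Eq. (7), the following is a minimal Python sketch that integrates f(x) log2(1/f(x)) numerically. The function names, the midpoint-rule integration, and the example distributions are assumptions made for illustration; they are not taken from the surveyed schemes. The sketch shows that the entropy falls as the density concentrates, which is the sense in which a fully deterministic feature (no random component) tends toward −∞.

```python
import math

def differential_entropy(pdf, lo, hi, steps=100_000):
    """Approximate the integral of f(x) * log2(1 / f(x)) over [lo, hi] (midpoint rule)."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        fx = pdf(lo + (i + 0.5) * dx)
        if fx > 0.0:  # the integrand tends to 0 where the density is 0
            total += fx * math.log2(1.0 / fx) * dx
    return total

def uniform_pdf(a, b):
    """Density of a uniform distribution on [a, b] (illustrative choice)."""
    return lambda x: 1.0 / (b - a) if a <= x <= b else 0.0

def gaussian_pdf(sigma):
    """Density of a zero-mean Gaussian with standard deviation sigma (illustrative choice)."""
    return lambda x: math.exp(-x * x / (2 * sigma * sigma)) / (sigma * math.sqrt(2 * math.pi))

# A wide uniform density carries more uncertainty than a narrow one.
h_wide = differential_entropy(uniform_pdf(0.0, 8.0), 0.0, 8.0)
h_narrow = differential_entropy(uniform_pdf(0.0, 0.001), 0.0, 0.001)

# Gaussian case, compared against the closed form 0.5 * log2(2 * pi * e * sigma^2).
h_gauss = differential_entropy(gaussian_pdf(2.0), -40.0, 40.0)
h_gauss_closed = 0.5 * math.log2(2 * math.pi * math.e * 2.0 ** 2)
```

Shrinking the support of the uniform density drives the entropy negative without bound, which matches the convention that a scheme with no random component is assigned −∞.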

Table 4 Comparison of differential entropy of various hashing schemas

Security mechanism   Hashing schema   Differential entropy (DE)
A                    [2]              −∞
A                    [9]              −∞
A                    [10]             −∞
A                    [20]             −∞
B or C               [1]              5.7
B or C               [15]             5
B or C               [28]             4.8
B or C               [3]              8.3
B or C               [8]              10
B or C               [11]             7.2
B or C               [14]             8
B or C               [16]             23
B or C               [23]             16.2
B or C               [24]             8
D                    [12]             1974

6 Performance trade-offs

Early authentication methods can be easily defeated by counterfeit attacks, as their embedded watermarks are independent of the image content [9,25]. The absence of structural information in the hash leads to the risk of counterfeit attacks. By block searching and matching, even if the key is unknown to the attacker, two embedding units that carry similar watermarks can be exchanged to modify the image content without causing authentication failure [4,5,18]. In Tables 3 and 4, we provide specific performance results of different hash algorithms under both content-preserving and content-changing attacks. However, due to the large number of measures used in gauging robustness, it is difficult to grasp their meaning for practical use. We therefore transform the numerical values into linguistic representations: Low, Fair, and High. The relative performance of the security mechanisms is summarized in Table 5. The transfer protocol and detailed results can be found in the “Appendix”.

As can be seen from Table 5, there is no perfect image hashing method that achieves all desired performances. Although the methods proposed in Refs. [1,8,11,16,23,28] have a high degree of robustness, their randomization is limited; the method proposed by Malkin and Venkatesan [12] achieves high randomization, but its robustness and fragility are limited. In general, the four performance-related hash properties conflict with one another. The first property amounts to robustness under small perturbations, whereas the second requires minimization of collision probabilities for perceptually distinct inputs. There is a clear trade-off here; e.g., if a set of very crude features were used, they would be hard to change (i.e., robust), but one would likely encounter collisions between perceptually different images. Likewise, perfect randomization would require a uniform distribution of the output hash values (over the key space), which in general works against achieving the first property. From a security viewpoint, the second and third properties are very important; i.e., it must be extremely difficult for the adversary to manipulate the contents of an image and yet obtain the same hash value. It is desirable for the hash algorithm to achieve these conflicting goals to some extent and/or facilitate trade-offs.

Table 5 Comparison of the performance of image hashing algorithms

Category                Image hash algorithm   Robustness   Fragility   Randomization   Localization
Cryptography hash       MD5, SHA-1             Low          High        High            No
Statistics based        [1]                    High         High        Fair            Yes
Statistics based        [20]                   Low          Low         Low             Yes
Statistics based        [24]                   Fair         Low         Fair            Yes
Statistics based        [28]                   High         High        Fair            Yes
Relation based          [9]                    Low          High        Low             Yes
Relation based          [10]                   Fair         High        Low             Yes
Coarse representation   [4]                    Fair         High        Fair            No
Coarse representation   [8]                    High         Low         Fair            No
Coarse representation   [11]                   High         High        Fair            No
Coarse representation   [12]                   Fair         High        High            No
Coarse representation   [14]                   Fair         High        Fair            No
Coarse representation   [16]                   High         High        Fair            No
Coarse representation   [23]                   High         High        Fair            No
Low-level feature       [2]                    Low          High        Low             Yes
Low-level feature       [15]                   Fair         Fair        Fair            No

Security mechanism (column values as listed in the source): A B C C B+C D C B+C C A A B A B B D
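The tension between robustness and collision resistance discussed in this section can be made concrete with a toy example. The sketch below is a hypothetical block-mean hash (not one of the surveyed algorithms) in which the quantization step plays the role of feature coarseness; the pixel values and step sizes are invented for illustration.

```python
def block_mean_hash(pixels, step):
    """Toy hash: quantize the mean of each 4-pixel block with the given step size."""
    blocks = [pixels[i:i + 4] for i in range(0, len(pixels), 4)]
    return [int((sum(block) / len(block)) // step) for block in blocks]

# Illustrative 16-pixel "images" (block means: 100, 150, 60, 200).
original = [100, 102, 98, 100, 150, 152, 148, 150, 60, 62, 58, 60, 200, 198, 202, 200]
# Content-preserving change: every block mean shifts by 1 (e.g. mild compression noise).
noisy = [101, 103, 99, 101, 149, 151, 147, 149, 61, 63, 59, 61, 201, 199, 203, 201]
# Content-changing edit: the second block is replaced (mean 150 -> 170).
tampered = original[:4] + [170, 172, 168, 170] + original[8:]

# Very crude features (step 64): robust to noise, but the tampered image collides.
coarse = [block_mean_hash(img, 64) for img in (original, noisy, tampered)]
# Very fine features (step 1): tampering is detected, but so is harmless noise.
fine = [block_mean_hash(img, 1) for img in (original, noisy, tampered)]
# Intermediate step (16): for these particular inputs both goals are met.
medium = [block_mean_hash(img, 16) for img in (original, noisy, tampered)]
```

With the coarse step all three hashes coincide, with the fine step all three differ, and only the intermediate step separates the tampered image while tolerating the noise. The intermediate setting works here only because the chosen values stay clear of the quantization boundaries, which is why the trade-off has to be tuned rather than eliminated.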


7 Future research directions

In this paper, we have discussed the general requirements and features for designing a multimedia authentication system and presented an in-depth review and analysis of current image hash technologies for authentication purposes. These technologies have solved some of the problems and concerns faced in multimedia applications. However, as this is a new research area, many issues remain for further study and exploration. We list and discuss below some challenges that are particularly worthy of further study:

(1) Theoretical framework. Generally speaking, image/media hashing research lacks a rigorous theoretical framework. Existing algorithms employ either cryptographic or signal processing methods. A pure signal processing approach is robust to perceptually insignificant distortions but compromises security, which is desirable in applications for multimedia protection. Likewise, pure cryptographic techniques, while secure, completely ignore the requirement of being robust to incidental modifications of the media. A theoretical framework that can guide the trade-off among different performance criteria is needed.

(2) Performance trade-offs. Previous work on image hashing has focused extensively on the problem of capturing image characteristics, but performance trade-offs such as those between perceptual robustness, fragility, and randomization of the hash have not been explicitly analyzed. These trade-offs should be addressed via proper parameters in algorithm design.

(3) Robust feature extraction. To achieve robustness and security in image hashing, the most challenging part has been feature extraction. How to develop a good feature extraction method that is strong in robustness, sensitivity to modification, visual transparency, attack resistance, and local error correction capacity is still a big problem.

(4) Hash algorithm randomization. Randomization is applied either to the input signals or to algorithms that are independent of the inputs. For many applications, this randomization is insufficient. In many cases, how to keep the transformation perceptually robust while randomizing it is still an open problem.

(5) Measure of security. To the best of our knowledge, there is little research that compares the degree of security of image hash functions. A mathematical framework and derived expressions for evaluating the security of various common image hashing schemes are desired.
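Challenge (4) concerns key-dependent randomization of the kind used when random basis functions are projected onto an image. The following is a minimal sketch of that idea, assuming a key-seeded pseudorandom generator as a stand-in for the secret key and a simple sign quantizer; the function name, the Gaussian basis, and the 16-bit length are illustrative assumptions, not the construction of any surveyed scheme.

```python
import random

def keyed_projection_hash(pixels, key, n_bits=16):
    """Project mean-removed pixels onto key-dependent random bases; keep one sign bit each."""
    rng = random.Random(key)  # assumption: a seeded PRNG stands in for the secret key
    mean = sum(pixels) / len(pixels)
    bits = []
    for _ in range(n_bits):
        basis = [rng.gauss(0.0, 1.0) for _ in pixels]  # hypothetical random basis function
        projection = sum(b * (p - mean) for b, p in zip(basis, pixels))
        bits.append(1 if projection >= 0 else 0)
    return bits

def hamming_distance(bits_a, bits_b):
    """Number of positions at which two equal-length hashes differ."""
    return sum(a != b for a, b in zip(bits_a, bits_b))

image = [100, 102, 98, 100, 150, 152, 148, 150, 60, 62, 58, 60, 200, 198, 202, 200]
hash_a = keyed_projection_hash(image, "secret-key-1")
hash_a_again = keyed_projection_hash(image, "secret-key-1")
hash_b = keyed_projection_hash(image, "secret-key-2")
```

The same key always reproduces the same hash, while an adversary without the key cannot reproduce the bases and therefore cannot predict the projections. How stable the sign bits remain under content-preserving distortions depends on the bases drawn, which is the open robustness question the item refers to.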

Appendix A: Rating protocol and results

In Table 6, we provide the protocol used to transfer numerical performance values into perceived linguistic meanings: low, fair, and high. A summary of the detailed results can be found in Table 7, where the robustness ratings are determined by the majority of the method’s performance ratings against content-preserving attacks. For example, if the majority of the ratings are “fair”, then the method receives a “fair” robustness rating.

Table 6 List of transfer protocol

Rating: Low
  Robustness: Rotation < 5°; Cropping < 10%; Scaling < 50%; Compression < 5; Gaussian noise ≤ 0.1; Median filter = 0
  Fragility: True Positive Rate (TPR) ≤ 95%
  Randomization: DE ≤ 0

Rating: Fair
  Robustness: 5° ≤ Rotation ≤ 10°; 10% ≤ Cropping < 25%; 50% ≤ Scaling < 100%; 5 ≤ Compression < 10; 0.1 ≤ Gaussian noise ≤ 0.3; 0 < Median filter < 5
  Fragility: 95% < TPR < 100%
  Randomization: 0 < DE < 100

Rating: High
  Robustness: Rotation > 10°; Cropping ≥ 25%; Scaling ≥ 100%; Compression ≥ 10; Gaussian noise > 0.3; Median filter ≥ 5
  Fragility: TPR = 100%
  Randomization: DE ≥ 100

Table 7 Summary of linguistic ratings

Hash algorithm   Rotation   Cropping   Scaling   Compression   Median filter   Gaussian noise   Robustness rating
[1]              Low        Low        High      High          High            Fair             High
[2]              NA         NA         NA        NA            NA              NA               Low
[4]              NA         Low        NA        High          Fair            Fair             Fair
[8]              High       High       High      Fair          High            High             High
[9]              NA         NA         NA        High          NA              NA               Low
[10]             NA         Fair       High      High          NA              NA               Fair
[11]             High       High       High      Low           High            High             High
[12]             Fair       Low        Fair      High          Fair            Fair             Fair
[14]             Low        Low        Fair      High          Fair            Fair             Fair
[15]             Low        Fair       Fair      High          High            Fair             Fair
[16]             High       High       High      Low           High            Fair             High
[20]             NA         Fair       NA        NA            Fair            Fair             Low
[23]             Fair       High       High      High          High            Fair             High
[24]             Low        Fair       Low       Fair          Fair            Fair             Fair
[28]             High       High       High      High          High            Low              High

NA not applicable
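The rating protocol can be sketched in a few lines of Python. The thresholds for randomization and fragility below follow Table 6; the function names are hypothetical, and the majority rule is a simple most-frequent-label vote in which ties go to the lower rating and “NA” entries are ignored. Both of those choices are assumptions, since the appendix does not state how ties or NA entries are handled.

```python
from collections import Counter

LEVELS = ["Low", "Fair", "High"]

def rate_randomization(de):
    """Map differential entropy (DE) to a linguistic rating using the Table 6 thresholds."""
    if de <= 0:
        return "Low"
    if de < 100:
        return "Fair"
    return "High"

def rate_fragility(tpr_percent):
    """Map the true positive rate (in percent) to a rating using the Table 6 thresholds."""
    if tpr_percent <= 95:
        return "Low"
    if tpr_percent < 100:
        return "Fair"
    return "High"

def majority_rating(ratings):
    """Return the most frequent rating; ties go to the lower level (assumption)."""
    counts = Counter(r for r in ratings if r in LEVELS)  # "NA" entries are ignored (assumption)
    if not counts:
        raise ValueError("no applicable ratings to aggregate")
    best = max(counts.values())
    for level in LEVELS:
        if counts.get(level, 0) == best:
            return level

# Differential entropy values taken from Table 4.
de_ratings = [rate_randomization(de) for de in (float("-inf"), 8.3, 1974)]
# Per-attack ratings of algorithm [12] taken from Table 7.
overall_12 = majority_rating(["Fair", "Low", "Fair", "High", "Fair", "Fair"])
```

For the rows of Table 7 without NA entries, this vote agrees with the listed robustness rating in the cases checked here; rows containing NA entries appear to follow a stricter rule that the sketch does not attempt to reproduce.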

References
1. Alghoniemy, M., Tewfik, A.H.: Geometric invariance in image watermarking. IEEE Trans. Image Process. 13(2), 145–153 (2004)
2. Dittmann, J., Steinmetz, A., Steinmetz, R.: Content-based digital signature for motion pictures authentication and content-fragile watermarking. Proceedings IEEE International Conference on Multimedia Computing and System, vol. 2, pp. 209–213 (1999)
3. Fridrich, J.: Visual hash for oblivious watermarking. Proceedings SPIE: Security and Watermarking of Multimedia Contents II, vol. 3971, pp. 286–294 (2000)
4. Fridrich, J., Goljan, M.: Robust hash functions for digital watermarking. Proceedings IEEE International Conference on Information Technology: Coding Computing, pp. 178–183 (2000)
5. Holliman, M., Memon, N.: Counterfeiting attacks on oblivious block-wise independent invisible watermarking schemas. IEEE Trans. Image Process. 9(3), 432–441 (2000)
6. Holliman, M., Memon, N., Yeung, M.: On the need for image dependent keys for watermarking. Proceedings of the Second Workshop on Multimedia, NJIT (1999)
7. Kim, H.S., Lee, H.K.: Invariant image watermark using Zernike moments. IEEE Trans. Circuits Syst. Video Technol. 13(8), 766–775 (2003)
8. Kozat, S.S., Venkatesan, R., Mihçak, M.K.: Robust perceptual image hashing via matrix invariants. Proceedings IEEE International Conference on Image Processing, Singapore, vol. 5, pp. 3443–3446 (2004)
9. Lin, C.Y., Chang, S.F.: A robust image authentication method distinguishing JPEG compression from malicious manipulation. IEEE Trans. Circuits Syst. Video Technol. 11(2), 153–168 (2001)
10. Lu, C.S., Liao, H.Y.: Structural digital signature for image authentication: an incidental distortion resistant scheme. IEEE Trans. Multimed. 5(2), 161–173 (2003)
11. Lv, X.D., Wang, Z.J.: Fast Johnson–Lindenstrauss transform for robust and secure image hashing. Proceedings IEEE 10th Workshop on Multimedia Signal Processing, October 2008
12. Malkin, M., Venkatesan, R.: The randlet transform: applications to universal perceptual hashing and image authentication. Proceedings Allerton Conference, Monticello, IL (2004)
13. Mao, Y., Wu, M.: Unicity distance of robust image hashing. IEEE Trans. Inf. Forensics Sec. 2(3), 462–467 (2007)
14. Mihçak, M.K., Venkatesan, R.: New iterative geometric methods for robust perceptual image hashing. Proceedings ACM Workshop Security and Privacy in Digital Rights Management, Philadelphia, PA, November 2001
15. Monga, V., Evans, B.L.: Perceptual image hashing via feature points: performance evaluation and tradeoffs. IEEE Trans. Image Process. 15(11), 3452–3465 (2006)
16. Monga, V., Mihçak, M.K.: Robust and secure image hashing via non-negative matrix factorizations. IEEE Trans. Inf. Forensics Sec. 2(3), 376–390 (2007)
17. Radhakrishnan, R., Memon, N.: On the security of the SARI image authentication system. Proceedings IEEE International Conference on Image Processing, vol. II, pp. 971–974 (2001)
18. Radhakrishnan, R., Xiong, Z., Memon, N.: On the security of the visual hash function. J. Electron. Imaging 14(1), 1–10 (2005)
19. Rey, C., Dugelay, J.L.: A survey of watermarking algorithms for image authentication. EURASIP J. Appl. Signal Process. 6(3), 613–621 (2002)
20. Schneider, M., Chang, S.F.: A robust content based digital signature for image authentication. Proceedings IEEE International Conference on Image Processing, Lausanne, Switzerland, vol. 3, pp. 227–230, September 1996
21. Signal & Image Processing Institute: The USC-SIPI Image Database. University of Southern California. http://sipi.usc.edu/database/ (2007). Accessed 10 Apr 2007
22. Simitopoulos, D., Koutsonanos, D.E., Strintzis, M.G.: Robust image watermarking based on generalized radon transformations. IEEE Trans. Circuits Syst. Video Technol. 13(8), 732–745 (2003)
23. Swaminathan, A., Mao, Y., Wu, M.: Robust and secure image hashing. IEEE Trans. Inf. Forensics Sec. 1(2), 215–230 (2006)
24. Venkatesan, R., Koon, S.M., Jakubowski, M.H., Moulin, P.: Robust image hashing. Proceedings IEEE Conference Image Processing, pp. 664–666, September 2000
25. Wong, P.W.: A public key watermark for image verification and authentication. Proceedings IEEE ICIP, pp. 425–429 (1998)
26. Wu, Y.: On the security of an SVD-based ownership watermarking. IEEE Trans. Multimed. 7(4), 624–627 (2005)
27. Wu, J., Zhu, B., Li, S., Lin, F.: Efficient Oracle attacks on Yeung-Mintzer and variant authentication schemes. Proceedings of the IEEE International Conference on Multimedia & Expo (ICME’04), Taiwan (2004)
28. Xiang, S.J., Kim, H.J., Huang, J.W.: Histogram-based image hashing scheme robust against geometric deformation. The 9th ACM Multimedia and Security Workshop (2007)
